1 Problem Statement
During a drill, we used Wireshark to capture a packet similar to the one shown below. How can we perform USB traffic analysis on it?
![USB traffic analysis](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-x53v26e0jn.png)
2 Problem Analysis
2.1 How is the traffic packet captured?
First, from the packet above, we can deduce that it’s a USB traffic packet. We can start by analyzing how the USB packet is captured.
Before starting, let’s introduce some basic USB knowledge. USB has different specifications, and here are the three ways to use USB:
- USB UART
- USB HID
- USB Memory
UART stands for Universal Asynchronous Receiver/Transmitter. In this mode, the device simply uses USB to receive and transmit data, with no other communication functions.
HID stands for Human Interface Device. This class covers interactive equipment such as keyboards, mice, game controllers, and digital display devices.
Lastly, there is USB Memory, or data storage. External HDDs, thumb drives, flash drives, and similar devices fall into this category.
The most widely used of these are USB HID and USB Memory.
Every USB device (especially HID or Memory) has a Vendor ID and a Product ID. The Vendor ID indicates which manufacturer made the device. The Product ID distinguishes different products; it is not necessarily a unique number, although unique numbers are preferred. An example is shown below.
![USB traffic analysis](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-v6segr5x2c.png)
The image shows the list of USB devices connected to my computer in a virtual machine environment, viewed through the ‘lsusb’ command.
For instance, I have a wireless mouse in VMware. It is an HID device and it is working normally, so it appears in the list of USB devices printed by this command. Can you identify which one is the mouse? That’s right, it’s the fourth one, shown here:
Bus 002 Device 002: ID 0e0f:0003 VMware, Inc. Virtual Mouse
Here, ID 0e0f:0003 is the Vendor ID and Product ID pair: the Vendor ID is 0e0f and the Product ID is 0003. Bus 002 Device 002 means the device is attached as device 2 on USB bus 2. Note the bus number, because we will need it when choosing a capture interface.
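As a small illustration of how such a line breaks down, here is a minimal sketch that splits one lsusb line into its fields. The function name `parse_lsusb_line` and the regular expression are my own assumptions for this example and only cover lines shaped like the one above.

```python
import re

# Matches lines shaped like the lsusb output quoted above, e.g.
# "Bus 002 Device 002: ID 0e0f:0003 VMware, Inc. Virtual Mouse"
LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d+) Device (?P<device>\d+): "
    r"ID (?P<vendor>[0-9a-fA-F]{4}):(?P<product>[0-9a-fA-F]{4})\s*(?P<name>.*)"
)

def parse_lsusb_line(line):
    """Split one lsusb line into bus, device, vendor ID, product ID and name.

    Returns None when the line does not have the expected shape.
    """
    match = LSUSB_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "bus": int(match.group("bus")),
        "device": int(match.group("device")),
        "vendor_id": match.group("vendor").lower(),
        "product_id": match.group("product").lower(),
        "name": match.group("name"),
    }

info = parse_lsusb_line("Bus 002 Device 002: ID 0e0f:0003 VMware, Inc. Virtual Mouse")
print(info)
```

The bus number returned here is the value to remember for the capture step.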
We could run Wireshark as root to capture USB data streams, but this is usually not recommended. Instead, we give an ordinary user sufficient permissions to access USB data streams in Linux, which can be done with udev. We create a user group called ‘usbmon’, add our account to it, and install a udev rule that grants the group read access (these commands themselves need root):
addgroup usbmon
gpasswd -a $USER usbmon
echo 'SUBSYSTEM=="usbmon", GROUP="usbmon", MODE="640"' > /etc/udev/rules.d/99-usbmon.rules
Next, we need the usbmon kernel module. If this module is not loaded, we can load it using the following command:
modprobe usbmon
Open Wireshark, and you’ll see usbmonX where X represents a number. Below is the result of our current analysis (I’m using root):
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-bdbcakezq9.png)
If an interface is active or has traffic passing through it, Wireshark draws a small activity graph next to it. Which one should we select? The one I asked you to note earlier: the X in usbmonX corresponds to the USB bus number, and usbmon0 captures traffic from all buses at once. In this article, we use usbmon0. Open it to observe the packets.
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-pqkt8h9foo.png)
Through this, we can understand the communication process and operating principles between the USB device and host, allowing us to analyze the traffic packet.
2.2 How to Analyze a USB Traffic Packet?
With that background, we have an idea of how USB traffic is captured. Next, we’ll look at how to analyze a captured USB packet.
For details of the USB protocol, refer to Wireshark’s wiki.
We’ll start with a simple example from GitHub:
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-bbu5l6dblv.png)
Our analysis shows that the data portion of the USB packet is in the Leftover Capture Data field. On macOS and Linux, the tshark command can extract just this field:
tshark -r example.pcap -T fields -e usb.capdata
To save the output to usbdata.txt, append a redirect to the end of the command: > usbdata.txt
On Windows, tshark.exe is located in the Wireshark installation directory. For example, mine is at D:\Program Files\Wireshark\tshark.exe
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-5lcbwbn582.png)
Open cmd, navigate to that directory, and enter the following command:
tshark.exe -r example.pcap -T fields -e usb.capdata
As before, append > usbdata.txt to the end of the command to save the output to usbdata.txt.
For detailed use of the tshark command, refer to Wireshark’s official documentation: https://www.wireshark.org/docs/man-pages/tshark.html
Run the command and open usbdata.txt. Each line holds the data of one packet, and here every packet’s data is eight bytes long.
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-tao5gjffge.png)
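The same extraction can be driven from Python. The sketch below is only an illustration: the helper names `normalize_capdata` and `extract_capdata` are hypothetical, it assumes tshark is on the PATH, and it assumes that the payload may be printed either with colon separators (as in the screenshot) or as one run of hex digits, which is how some tshark versions print this field.

```python
import subprocess

def normalize_capdata(line):
    """Return the payload as a list of two-character lowercase hex strings.

    Accepts both "00:00:04:00:00:00:00:00" and "0000040000000000".
    Blank lines and odd-length input give an empty list.
    """
    text = line.strip().replace(":", "").lower()
    if not text or len(text) % 2 != 0:
        return []
    return [text[i:i + 2] for i in range(0, len(text), 2)]

def extract_capdata(pcap_path):
    """Run tshark on pcap_path and return one byte list per non-empty payload."""
    result = subprocess.run(
        ["tshark", "-r", pcap_path, "-T", "fields", "-e", "usb.capdata"],
        capture_output=True, text=True, check=True,
    )
    rows = (normalize_capdata(line) for line in result.stdout.splitlines())
    return [row for row in rows if row]

# The normalizer can be tried without a capture file:
print(normalize_capdata("00:00:04:00:00:00:00:00"))
print(normalize_capdata("0000040000000000"))
```

Calling `extract_capdata("example.pcap")` would then return the same rows that the command line writes to usbdata.txt, with blank lines dropped.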
Regarding the characteristics of USB applications, I found a diagram that clearly reflects this issue:
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-griamia306.png)
Here, we focus only on keyboard traffic and mouse traffic.
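Keyboard packets have a data length of 8 bytes. In the standard HID boot keyboard layout, the first byte holds the modifier keys (such as Shift), the second byte is reserved, and the keystroke information is concentrated in the 3rd byte. Each keystroke generates a keyboard event USB packet.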
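To make the byte layout concrete, here is a minimal sketch that splits one 8-byte keyboard report into its modifier keys and key codes. It assumes the capture uses the standard HID boot keyboard report (modifier bitmask, reserved byte, up to six key codes); the function name `parse_keyboard_report` is hypothetical.

```python
# Bit positions of the modifier byte in the HID boot keyboard report.
MODIFIER_BITS = [
    "LeftCtrl", "LeftShift", "LeftAlt", "LeftGUI",
    "RightCtrl", "RightShift", "RightAlt", "RightGUI",
]

def parse_keyboard_report(hex_line):
    """Decode one 8-byte report given as hex text, with or without colons.

    Returns None when the line is not exactly 8 bytes long.
    """
    data = bytes.fromhex(hex_line.strip().replace(":", ""))
    if len(data) != 8:
        return None
    modifiers = [name for bit, name in enumerate(MODIFIER_BITS)
                 if data[0] & (1 << bit)]
    # Byte 1 is reserved; bytes 2..7 hold the pressed key codes (0 = empty slot).
    keycodes = [code for code in data[2:8] if code != 0]
    return {"modifiers": modifiers, "keycodes": keycodes}

print(parse_keyboard_report("00:00:04:00:00:00:00:00"))
print(parse_keyboard_report("20:00:0b:00:00:00:00:00"))
```

The second sample has 0x20 in the first byte, which is the right Shift bit.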
Mouse packets have a data length of 4 bytes. The first byte represents the buttons: 0x00 means no button is pressed, 0x01 means the left button is pressed, and 0x02 means the right button is pressed. The second byte is a signed byte whose highest bit is the sign bit: a positive value is the number of pixels the mouse moved to the right, and a negative value is the number of pixels it moved to the left. The third byte works the same way as the second and gives the vertical movement offset.
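The signed-byte interpretation is easy to get wrong, so here is a minimal sketch of decoding a single 4-byte mouse report as described above. The names `to_signed8` and `parse_mouse_report` are hypothetical helpers for this example.

```python
def to_signed8(value):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value (-128..127)."""
    return value - 256 if value > 127 else value

def parse_mouse_report(hex_line):
    """Decode one 4-byte mouse report given as hex text, with or without colons.

    Returns None when the line is not exactly 4 bytes long.
    """
    data = bytes.fromhex(hex_line.strip().replace(":", ""))
    if len(data) != 4:
        return None
    return {
        "left": bool(data[0] & 0x01),   # bit 0: left button
        "right": bool(data[0] & 0x02),  # bit 1: right button
        "dx": to_signed8(data[1]),      # horizontal offset
        "dy": to_signed8(data[2]),      # vertical offset
    }

print(parse_mouse_report("01:ff:02:00"))
```

Here 0xff in the second byte decodes to a horizontal offset of -1, that is, one pixel to the left.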
I reviewed a large number of USB protocol documents. The USB HID usage tables define which value corresponds to which key.
Using the USB keyboard mapping table below, take out the third byte of each packet and decode it:
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-pblv4mq0mc.png)
We wrote the following script:
mappings = { 0x04:"A", 0x05:"B", 0x06:"C", 0x07:"D", 0x08:"E", 0x09:"F", 0x0A:"G", 0x0B:"H", 0x0C:"I", 0x0D:"J", 0x0E:"K", 0x0F:"L", 0x10:"M", 0x11:"N",0x12:"O", 0x13:"P", 0x14:"Q", 0x15:"R", 0x16:"S", 0x17:"T", 0x18:"U",0x19:"V", 0x1A:"W", 0x1B:"X", 0x1C:"Y", 0x1D:"Z", 0x1E:"1", 0x1F:"2", 0x20:"3", 0x21:"4", 0x22:"5", 0x23:"6", 0x24:"7", 0x25:"8", 0x26:"9", 0x27:"0", 0x28:"\n", 0x2a:"[DEL]", 0x2B:"\t", 0x2C:" ", 0x2D:"-", 0x2E:"=", 0x2F:"[", 0x30:"]", 0x31:"\\", 0x32:"~", 0x33:";", 0x34:"'", 0x36:",", 0x37:"." }
nums = []
keys = open('usbdata.txt')
for line in keys:
    # keep only reports of the form 00:00:xx:00:00:00:00:00
    # (no modifier key and a single key code in the third byte)
    if line[0]!='0' or line[1]!='0' or line[3]!='0' or line[4]!='0' or line[9]!='0' or line[10]!='0' or line[12]!='0' or line[13]!='0' or line[15]!='0' or line[16]!='0' or line[18]!='0' or line[19]!='0' or line[21]!='0' or line[22]!='0':
        continue
    # 00:00:xx:....
    nums.append(int(line[6:8],16))
keys.close()
output = ""
for n in nums:
    # 0 means no key is pressed (key release)
    if n == 0 :
        continue
    if n in mappings:
        output += mappings[n]
    else:
        output += '[unknown]'
print('output :\n' + output)
The result is as follows:
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-27lqr0azgo.png)
We integrated the previous steps into a single script:
#!/usr/bin/env python
import sys
import os
DataFileName = "usb.dat"
presses = []
normalKeys = {"04":"a", "05":"b", "06":"c", "07":"d", "08":"e", "09":"f", "0a":"g", "0b":"h", "0c":"i", "0d":"j", "0e":"k", "0f":"l", "10":"m", "11":"n", "12":"o", "13":"p", "14":"q", "15":"r", "16":"s", "17":"t", "18":"u", "19":"v", "1a":"w", "1b":"x", "1c":"y", "1d":"z","1e":"1", "1f":"2", "20":"3", "21":"4", "22":"5", "23":"6","24":"7","25":"8","26":"9","27":"0","28":"<RET>","29":"<ESC>","2a":"<DEL>", "2b":"\t","2c":"<SPACE>","2d":"-","2e":"=","2f":"[","30":"]","31":"\\","32":"<NON>","33":";","34":"'","35":"<GA>","36":",","37":".","38":"/","39":"<CAP>","3a":"<F1>","3b":"<F2>", "3c":"<F3>","3d":"<F4>","3e":"<F5>","3f":"<F6>","40":"<F7>","41":"<F8>","42":"<F9>","43":"<F10>","44":"<F11>","45":"<F12>"}
shiftKeys = {"04":"A", "05":"B", "06":"C", "07":"D", "08":"E", "09":"F", "0a":"G", "0b":"H", "0c":"I", "0d":"J", "0e":"K", "0f":"L", "10":"M", "11":"N", "12":"O", "13":"P", "14":"Q", "15":"R", "16":"S", "17":"T", "18":"U", "19":"V", "1a":"W", "1b":"X", "1c":"Y", "1d":"Z","1e":"!", "1f":"@", "20":"#", "21":"$", "22":"%", "23":"^","24":"&","25":"*","26":"(","27":")","28":"<RET>","29":"<ESC>","2a":"<DEL>", "2b":"\t","2c":"<SPACE>","2d":"_","2e":"+","2f":"{","30":"}","31":"|","32":"<NON>","33":"\"","34":":","35":"<GA>","36":"<","37":">","38":"?","39":"<CAP>","3a":"<F1>","3b":"<F2>", "3c":"<F3>","3d":"<F4>","3e":"<F5>","3f":"<F6>","40":"<F7>","41":"<F8>","42":"<F9>","43":"<F10>","44":"<F11>","45":"<F12>"}
def main():
    # check argv
    if len(sys.argv) != 2:
        print("Usage : ")
        print(" python UsbKeyboardHacker.py data.pcap")
        print("Tips : ")
        print(" To use this python script , you must install the tshark first.")
        print(" You can use `sudo apt-get install tshark` to install it")
        print("Author : ")
        print(" Angel_Kitty <[email protected]>")
        print(" If you have any questions , please contact me by email.")
        print(" Thank you for using.")
        sys.exit(1)
    # get argv
    pcapFilePath = sys.argv[1]
    # get data of pcap
    os.system("tshark -r %s -T fields -e usb.capdata > %s" % (pcapFilePath, DataFileName))
    # read data
    with open(DataFileName, "r") as f:
        for line in f:
            presses.append(line[0:-1])
    # handle
    result = ""
    for press in presses:
        # skip blank lines (packets without capture data)
        if not press:
            continue
        Bytes = press.split(":")
        # skip lines that are too short to hold a key code in the third byte
        if len(Bytes) < 3:
            continue
        if Bytes[0] == "00":
            if Bytes[2] != "00":
                result += normalKeys.get(Bytes[2], "[unknown]")
        elif Bytes[0] in ("02", "20"): # left (02) or right (20) shift key is pressed.
            if Bytes[2] != "00":
                result += shiftKeys.get(Bytes[2], "[unknown]")
        else:
            print("[-] Unknown Key : %s" % (Bytes[0]))
    print("[+] Found : %s" % (result))
    # clean the temp data
    os.remove(DataFileName)
if __name__ == "__main__":
    main()
The result is as follows:
![](https://www.ids-sax2.com/wp-content/uploads/2024/12/image-2.png)
Additionally, here’s a mouse traffic packet conversion script:
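keys = open('usbdata.txt','r')
posx = 0
posy = 0
for line in keys:
    # a 4-byte report prints as "xx:xx:xx:xx" plus a newline, i.e. 12 characters
    if len(line) != 12 :
        continue
    x = int(line[3:5],16)
    y = int(line[6:8],16)
    # reinterpret the unsigned bytes as signed 8-bit offsets
    if x > 127 :
        x -= 256
    if y > 127 :
        y -= 256
    # accumulate the relative offsets into an absolute position
    posx += x
    posy += y
    btn_flag = int(line[0:2],16) # 1 for left , 2 for right , 0 for nothing
    if btn_flag == 1 :
        print('%d %d' % (posx, posy))
keys.close()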
The keyboard traffic packet conversion script is as follows:
nums=[0x66,0x30,0x39,0x65,0x35,0x34,0x63,0x31,0x62,0x61,0x64,0x32,0x78,0x33,0x38,0x6d,0x76,0x79,0x67,0x37,0x77,0x7a,0x6c,0x73,0x75,0x68,0x6b,0x69,0x6a,0x6e,0x6f,0x70]
s=''
for x in nums:
    s+=chr(x)
print(s)
mappings = { 0x41:"A", 0x42:"B", 0x43:"C", 0x44:"D", 0x45:"E", 0x46:"F", 0x47:"G", 0x48:"H", 0x49:"I", 0x4a:"J", 0x4b:"K", 0x4c:"L", 0x4d:"M", 0x4e:"N",0x4f:"O", 0x50:"P", 0x51:"Q", 0x52:"R", 0x53:"S", 0x54:"T", 0x55:"U",0x56:"V", 0x57:"W", 0x58:"X", 0x59:"Y", 0x5a:"Z", 0x60:"0", 0x61:"1", 0x62:"2", 0x63:"3", 0x64:"4", 0x65:"5", 0x66:"6", 0x67:"7", 0x68:"8", 0x69:"9", 0x6a:"*", 0x6b:"+", 0X6c:"separator", 0x6d:"-", 0x6e:".", 0x6f:"/" }
output = ""
for n in nums:
    if n == 0 :
        continue
    if n in mappings:
        output += mappings[n]
    else:
        output += '[unknown]'
print('output :\n' + output)
The project link for the above example is: https://files.cnblogs.com/files/ECJTUACM-873284962/UsbKeyboardDataHacker.rar
For the question we raised at the beginning, we can take the same approach as in the example above.
First, we export all usb.capdata using tshark:
tshark -r task_AutoKey.pcapng -T fields -e usb.capdata
To save the output to usbdata.txt, append a redirect to the end of the command: > usbdata.txt
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-b8oeepgw8s.png)
We use the Python script from above to extract the third byte and decode it with the reference table:
mappings = { 0x04:"A", 0x05:"B", 0x06:"C", 0x07:"D", 0x08:"E", 0x09:"F", 0x0A:"G", 0x0B:"H", 0x0C:"I", 0x0D:"J", 0x0E:"K", 0x0F:"L", 0x10:"M", 0x11:"N",0x12:"O", 0x13:"P", 0x14:"Q", 0x15:"R", 0x16:"S", 0x17:"T", 0x18:"U",0x19:"V", 0x1A:"W", 0x1B:"X", 0x1C:"Y", 0x1D:"Z", 0x1E:"1", 0x1F:"2", 0x20:"3", 0x21:"4", 0x22:"5", 0x23:"6", 0x24:"7", 0x25:"8", 0x26:"9", 0x27:"0", 0x28:"\n", 0x2a:"[DEL]", 0x2B:"\t", 0x2C:" ", 0x2D:"-", 0x2E:"=", 0x2F:"[", 0x30:"]", 0x31:"\\", 0x32:"~", 0x33:";", 0x34:"'", 0x36:",", 0x37:"." }
nums = []
keys = open('usbdata.txt')
for line in keys:
    # keep only reports of the form 00:00:xx:00:00:00:00:00
    # (no modifier key and a single key code in the third byte)
    if line[0]!='0' or line[1]!='0' or line[3]!='0' or line[4]!='0' or line[9]!='0' or line[10]!='0' or line[12]!='0' or line[13]!='0' or line[15]!='0' or line[16]!='0' or line[18]!='0' or line[19]!='0' or line[21]!='0' or line[22]!='0':
        continue
    # 00:00:xx:....
    nums.append(int(line[6:8],16))
keys.close()
output = ""
for n in nums:
    # 0 means no key is pressed (key release)
    if n == 0 :
        continue
    if n in mappings:
        output += mappings[n]
    else:
        output += '[unknown]'
print('output :\n' + output)
The result of the operation is as follows:
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-q1jv7pidzc.png)
output :n[unknown]A[unknown]UTOKEY''.DECIPHER'[unknown]MPLRVFFCZEYOUJFJKYBXGZVDGQAURKXZOLKOLVTUFBLRNJESQITWAHXNSIJXPNMPLSHCJBTYHZEALOGVIAAISSPLFHLFSWFEHJNCRWHTINSMAMBVEXO[DEL]PZE[DEL]IZ'
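Note the [DEL] markers near the end of this output. Key code 0x2a is the Backspace key, so each marker erases the character typed just before it, and the text has to be cleaned up before it can be treated as ciphertext. A minimal sketch of that cleanup follows; the helper name `apply_backspaces` is hypothetical.

```python
def apply_backspaces(decoded, marker="[DEL]"):
    """Replay Backspace markers: each marker removes the previous character."""
    out = []
    i = 0
    while i < len(decoded):
        if decoded.startswith(marker, i):
            if out:
                out.pop()      # erase the character typed before the Backspace
            i += len(marker)
        else:
            out.append(decoded[i])
            i += 1
    return "".join(out)

print(apply_backspaces("MBVEXO[DEL]PZE[DEL]IZ"))  # prints MBVEXPZIZ
```

This is why the ciphertext used for the key search ends in MBVEXPZIZ rather than in the raw keystrokes.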
The decoded text contains the words AUTOKEY and DECIPHER, which tells us that the letters after them are ciphertext encrypted with the Autokey cipher. The question now is how to decrypt it without knowing the key.
I found the following article on how to recover the key: http://www.practicalcryptography.com/cryptanalysis/stochastic-searching/cryptanalysis-autokey-cipher/
The brute-force script is below:
from ngram_score import ngram_score
from pycipher import Autokey
import re
from itertools import permutations
qgram = ngram_score('quadgrams.txt')
trigram = ngram_score('trigrams.txt')
ctext = 'MPLRVFFCZEYOUJFJKYBXGZVDGQAURKXZOLKOLVTUFBLRNJESQITWAHXNSIJXPNMPLSHCJBTYHZEALOGVIAAISSPLFHLFSWFEHJNCRWHTINSMAMBVEXPZIZ'
ctext = re.sub(r'[^A-Z]','',ctext.upper())
# keep a list of the N best things we have seen, discard anything else
class nbest(object):
    def __init__(self,N=1000):
        self.store = []
        self.N = N
    def add(self,item):
        self.store.append(item)
        self.store.sort(reverse=True)
        self.store = self.store[:self.N]
    def __getitem__(self,k):
        return self.store[k]
    def __len__(self):
        return len(self.store)
#init
N=100
for KLEN in range(3,20):
    rec = nbest(N)
    # try every 3-letter key prefix and keep the N best by trigram score
    for i in permutations('ABCDEFGHIJKLMNOPQRSTUVWXYZ',3):
        key = ''.join(i) + 'A'*(KLEN-len(i))
        pt = Autokey(key).decipher(ctext)
        score = 0
        for j in range(0,len(ctext),KLEN):
            score += trigram.score(pt[j:j+3])
        rec.add((score,''.join(i),pt[:30]))
    next_rec = nbest(N)
    # extend each surviving key by one letter at a time, scoring with quadgrams
    for i in range(0,KLEN-3):
        for k in range(N):
            for c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
                key = rec[k][1] + c
                fullkey = key + 'A'*(KLEN-len(key))
                pt = Autokey(fullkey).decipher(ctext)
                score = 0
                for j in range(0,len(ctext),KLEN):
                    score += qgram.score(pt[j:j+len(key)])
                next_rec.add((score,key,pt[:30]))
        rec = next_rec
        next_rec = nbest(N)
    # pick the candidate whose full decryption scores best
    bestkey = rec[0][1]
    pt = Autokey(bestkey).decipher(ctext)
    bestscore = qgram.score(pt)
    for i in range(N):
        pt = Autokey(rec[i][1]).decipher(ctext)
        score = qgram.score(pt)
        if score > bestscore:
            bestkey = rec[i][1]
            bestscore = score
    print('%s autokey, klen %d :"%s", %s' % (bestscore, KLEN, bestkey, Autokey(bestkey).decipher(ctext)))
The resulting output is as follows:
![](https://www.ids-sax2.com/wp-content/uploads/picture/ask-qcloudimg-com-c2770eh9rt.png)
The word ‘flag’ appears in the output for key length 8. The relevant line is:
-674.914569565 autokey, klen 8 :"FLAGHERE", HELLOBOYSANDGIRLSYOUARESOSMARTTHATYOUCANFINDTHEFLAGTHATIHIDEINTHEKEYBOARDPACKAGEFLAGISJHAWLZKEWXHNCDHSLWBAQJTUQZDXZQPF
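As a sanity check on this result, the Autokey decryption can be reproduced by hand. The sketch below is a plain-Python version of the decipher step, not the pycipher implementation; the function name `autokey_decipher` is hypothetical, and it assumes the input contains only the letters A to Z.

```python
from string import ascii_uppercase as ALPHABET

def autokey_decipher(ciphertext, key):
    """Decrypt an Autokey ciphertext (letters A-Z only).

    The keystream starts with the key and continues with the plaintext
    itself, so each recovered letter is appended to the keystream.
    """
    keystream = list(key.upper())
    plaintext = []
    for i, c in enumerate(ciphertext.upper()):
        shift = ALPHABET.index(keystream[i])
        p = ALPHABET[(ALPHABET.index(c) - shift) % 26]
        plaintext.append(p)
        keystream.append(p)
    return "".join(plaintext)

# First 20 letters of the ciphertext, decrypted with the recovered key.
print(autokey_decipher("MPLRVFFCZEYOUJFJKYBX", "FLAGHERE"))  # prints HELLOBOYSANDGIRLSYOU
```

The first eight letters are decrypted with the key FLAGHERE, and from the ninth letter on the already recovered plaintext serves as the key.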
Splitting the plaintext into words gives:
HELLO
BOYS
AND
GIRLS
YOU
ARE
SO
SMART
THAT
YOU
CAN
FIND
THE
FLAG
THAT
I
HIDE
IN
THE
KEY
BOARD
PACKAGE
FLAG
IS
JHAWLZKEWXHNCDHSLWBAQJTUQZDXZQPF
The final flag is flag{JHAWLZKEWXHNCDHSLWBAQJTUQZDXZQPF}