tekCollect (formerly hashCollect)
Description: tekCollect started off as a tool to scrape MD5 hashes from specified files and URLs. As development continued, I realized that the program would be more useful if it could pull out other data types besides MD5s, such as IP addresses, URLs, SSNs, and more. With that in mind, I modified the code to include default searches for the types mentioned above. Additionally, I added the ability to search using the user's own custom regex.
There is much more planned for this tool. Expect to see database integration, more data types, and maybe even integration with other tools.
Current version is 0.4
Installation:
As this is a Python script, you will need to ensure you have the correct version of Python, which for this script is Python 2.7. I used mostly standard libraries (httplib2 is the only one that is not part of the standard library), but just in case you don't have them, here are the libraries that are required: httplib2, re, sys, argparse
With Python and the libraries out of the way, you can simply use git to clone the TekDefense code to your local machine.
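A quick way to confirm the required libraries are available is to try importing each one. The snippet below is only a convenience sketch and is not part of tekCollect; the helper name missing_modules is made up for illustration.

```python
import importlib

# Libraries listed as required above.
REQUIRED = ["httplib2", "re", "sys", "argparse"]

def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries: " + ", ".join(missing))
else:
    print("All required libraries are installed.")
```

Anything reported as missing can usually be installed with pip.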
git clone https://github.com/1aN0rmus/TekDefense.git
If you don't have git installed, you can simply download the script from https://github.com/1aN0rmus/TekDefense/blob/master/tekCollect.py
On Linux, if you would like to run this as an executable (./), be sure to:
chmod +x tekCollect.py
Usage:
As always, let's start off with the help command:
root@bt:~/workspace/Automater# ./hashCollect.py -h
usage: hashCollect.py [-h] [-u URL] [-f FILE] [-o OUTPUT] [-r REGEX] [-t TYPE]
                      [-s]

tekCollect is a tool that will scrape a file or website for specified data

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     This option is used to search for hashes on a website
  -f FILE, --file FILE  This option is used to import a file that contains
                        hashes
  -o OUTPUT, --output OUTPUT
                        This option will output the results to a file.
  -r REGEX, --regex REGEX
                        This option allows the user to set a custom regex
                        value. Must incase in single or double quotes.
  -t TYPE, --type TYPE  This option allows a user to choose the type of data
                        they want to pull out. Currently MD5, SHA1, SHA 256,
                        Domain, URL, IP4, IP6, CCN, SSN, EMAIL
  -s, --Summary         This options will show a summary of the data types in
                        a file
From the help output you will notice we have a few options when running this program. The only requirement is that you supply a file (-f) or a URL (-u). If no data type (-t) is given, the program assumes that you want to find MD5 hashes.
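For readers curious how rules like these can be expressed with argparse, here is a minimal sketch. It is reconstructed only from the help text and is not the actual tekCollect source; the function names build_parser and parse_args are illustrative, and whether the real script rejects a missing source in this exact way is an assumption.

```python
import argparse

def build_parser():
    # Hypothetical reconstruction from the help text; the real
    # tekCollect.py may be organised differently.
    parser = argparse.ArgumentParser(
        description="Scrape a file or website for specified data")
    parser.add_argument("-u", "--url", help="URL to search")
    parser.add_argument("-f", "--file", help="file to search")
    parser.add_argument("-o", "--output", help="write results to this file")
    parser.add_argument("-r", "--regex", help="custom regex, quoted")
    # MD5 is the default when no -t is given.
    parser.add_argument("-t", "--type", default="MD5",
                        help="data type to extract")
    parser.add_argument("-s", "--Summary", dest="summary",
                        action="store_true",
                        help="show a count per data type")
    return parser

def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # A source is mandatory: a file or a URL must be supplied.
    if args.file is None and args.url is None:
        parser.error("one of -f/--file or -u/--url is required")
    return args

args = parse_args(["-f", "mixfile"])
print(args.type)
```

Calling parse_args with an empty list exits with a usage error, which matches the requirement that a file or URL be given.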
To show you typical usage, here are a few examples:
Search a file for MD5 Hashes
root@bt:~/workspace/Automater# ./tekCollect.py -f mixfile -t MD5
7df966c6c0af44219b30b45716cfec56
64978daa09e3a6bfeceef409a41dbe24
fa8781a5a53a0d7076349d68a6a441f8
601f5d4627ed4594e667ecde2b884d2e
c1469d2375f7f4d78c2fad38ff5d7c45
a0fa0df2499cd4bb0e82d3ac891b7fb4
89f4a6196dd019cd0dbce4d2c95b7dd0
277ecaf092c1eff0e8426b0913ab7205
7cfaf2f497299a6483ba8cc803d4f176
f83d0416e4a36e841cbb9b3da2047244
e4996d186d7882b3d6c1897de7b7df89
d67f7203b96797f32536e6a941c2477b
b3cc1bf9cdbf852bb5ed40de40bd88f0
2ff9f72e4f138b365863ecfd41d1b96d
3457c332baa5bf3cc198875ae4c5407b
6e8ed1ff10339c0714fb13679d519595
03b0ce80f93c0727fd283f05c143af9b
a9e081829ef6fad48b90dd2f5317c1f6
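To give a feel for what an MD5 search like the one above involves, here is a minimal sketch in Python. The regex (exactly 32 hex characters on word boundaries) and the de-duplication step are my assumptions for illustration, not necessarily what tekCollect does internally; find_md5 is a made-up name.

```python
import re

# Assumed pattern: exactly 32 hex characters bounded by non-word characters,
# so longer hashes such as SHA1 (40 characters) are not partially matched.
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")

def find_md5(text):
    """Return unique MD5-looking strings, lowercased, in first-seen order."""
    seen = []
    for match in MD5_RE.findall(text):
        value = match.lower()
        if value not in seen:
            seen.append(value)
    return seen

sample = ("dropper 7df966c6c0af44219b30b45716cfec56 seen twice: "
          "7DF966C6C0AF44219B30B45716CFEC56, plus "
          "64978daa09e3a6bfeceef409a41dbe24 and a SHA1 "
          "da39a3ee5e6b4b0d3255bfef95601890afd80709")

for md5 in find_md5(sample):
    print(md5)
```

The word boundaries matter: without them, the first 32 characters of every SHA1 or SHA256 hash would be reported as an MD5.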
Search a URL for IP Addresses
root@bt:~/workspace/Automater# ./tekCollect.py -u http://minotauranalysis.com/malwarelist.aspx -t IP4
200.87.133.140
193.109.247.70
195.216.243.237
193.16.45.8
117.21.226.102
78.108.186.4
64.26.174.89
66.216.101.139
98.129.229.53
91.228.153.199
46.30.211.53
72.167.131.1
195.208.0.144
46.4.69.113
109.69.58.42
94.73.148.30
46.165.206.92
74.220.207.76
212.58.2.23
198.13.114.201
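Pulling IPv4 addresses out of a page follows the same idea with a different pattern. The sketch below is one possible approach, assuming a regex that limits each octet to 0-255; the pattern tekCollect actually uses may be looser or stricter, and find_ip4 is a hypothetical helper name.

```python
import re

# One octet in the range 0-255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

# Four octets separated by dots. The lookarounds stop the match from
# starting or ending in the middle of a longer dotted number.
IP4_RE = re.compile(
    r"(?<![0-9.])" + OCTET + r"(?:\." + OCTET + r"){3}(?![0-9])(?!\.[0-9])")

def find_ip4(text):
    """Return all IPv4-looking strings in the order they appear."""
    return IP4_RE.findall(text)

sample = ("C2 at 200.87.133.140, fallback 193.16.45.8. "
          "Bogus: 999.1.1.1 and 1.2.3")

for ip in find_ip4(sample):
    print(ip)
```

A simpler pattern such as four groups of one to three digits would also work for scraping, at the cost of accepting impossible addresses like 999.1.1.1.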
Search a URL for Email Addresses and output to a file
root@bt:~/workspace/Automater# ./tekCollect.py -u http://www.TekDefense.com/ -t EMAIL -o TekEmails.out
[+] Printing results to file: TekEmails.out
root@bt:~/workspace/Automater# cat TekEmails.out
1aN0rmus@TekDefense.com
Show a summary of the different types of data at a URL
root@bt:~/workspace/Automater# ./tekCollect.py -u http://www.Securabit.com/ -s
# of MD5 in the target: 0
# of SHA1 in the target: 0
# of SHA256 in the target: 0
# of DOMAIN in the target: 64
# of URL in the target: 20
# of IP4 in the target: 0
# of IP6 in the target: 0
# of SSN in the target: 0
# of EMAIL in the target: 0
# of CCN in the target: 6
Show a summary of the different types of data in a file
root@bt:~/workspace/Automater# ./tekCollect.py -f mixfile -s
# of MD5 in the target: 63
# of SHA1 in the target: 0
# of SHA256 in the target: 0
# of DOMAIN in the target: 48
# of URL in the target: 5
# of IP4 in the target: 2
# of IP6 in the target: 2
# of SSN in the target: 3
# of EMAIL in the target: 36
# of CCN in the target: 17
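A summary like the ones above can be produced by running every pattern over the same text and counting the hits. The following is a rough sketch under my own assumptions: the patterns are simplified stand-ins covering only a few of the types, and counting unique matches (rather than every occurrence) is a guess at the behaviour, not something confirmed from the tekCollect source.

```python
import re

# Simplified, assumed patterns for a handful of the supported types.
PATTERNS = [
    ("MD5", r"\b[a-fA-F0-9]{32}\b"),
    ("SHA1", r"\b[a-fA-F0-9]{40}\b"),
    ("SHA256", r"\b[a-fA-F0-9]{64}\b"),
    ("EMAIL", r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    ("SSN", r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b"),
]

def summarize(text):
    """Return (type name, number of unique matches) pairs in PATTERNS order."""
    counts = []
    for name, pattern in PATTERNS:
        unique = set(re.findall(pattern, text))
        counts.append((name, len(unique)))
    return counts

# Made-up sample text; the SSN is an obviously fake placeholder.
sample = ("7df966c6c0af44219b30b45716cfec56 1aN0rmus@TekDefense.com "
          "123-45-6789 7df966c6c0af44219b30b45716cfec56")

for name, count in summarize(sample):
    print("# of %s in the target: %d" % (name, count))
```

Keeping the patterns in an ordered list makes it easy to add a new data type without touching the counting logic.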
If you have any suggestions for the tool please let me know.