TY - GEN
T1 - Exposing the Rat in the Tunnel
T2 - 28th ACM SIGSAC Conference on Computer and Communications Security, CCS 2022
AU - Dodia, Priyanka
AU - Alsabah, Mashael
AU - Alrawi, Omar
AU - Wang, Tao
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/11/7
Y1 - 2022/11/7
N2 - Tor∼citetor is the most widely used anonymous communication network with millions of daily users∼citetormetrics. Since Tor provides server and client anonymity, hundreds of malware binaries found in the wild rely on it to hide their presence and hinder Command & Control (C&C) takedown operations. We believe Tor is a paramount tool enabling online freedom and privacy, and blocking it to defend against such malware is infeasible for both users and organizations. In this work, we present effective traffic analysis approaches that can accurately identify Tor-based malware communication. We collect hundreds of Tor-based malware binaries, execute and examine more than 47,000 active encrypted malware connections and compare them with benign browsing traffic. In addition to traditional traffic analysis features (which work at the connection level), we propose global host-level network features to capture peculiar malware communication fingerprints across host logs. Our experiments confirm that our models are able to detect "zero-day'' malware connections with 0.7% FPR even when malware connections constitute less than 5% of Tor traces in the test set. Using multi-labeling approaches, we are able to accurately detect the malware behavior-based classes (grayware, ransomware, etc). Finally, we evaluate the robustness of our models on real-world enterprise logs and show that the classifiers can identify infected hosts even with missing features.
AB - Tor∼citetor is the most widely used anonymous communication network with millions of daily users∼citetormetrics. Since Tor provides server and client anonymity, hundreds of malware binaries found in the wild rely on it to hide their presence and hinder Command & Control (C&C) takedown operations. We believe Tor is a paramount tool enabling online freedom and privacy, and blocking it to defend against such malware is infeasible for both users and organizations. In this work, we present effective traffic analysis approaches that can accurately identify Tor-based malware communication. We collect hundreds of Tor-based malware binaries, execute and examine more than 47,000 active encrypted malware connections and compare them with benign browsing traffic. In addition to traditional traffic analysis features (which work at the connection level), we propose global host-level network features to capture peculiar malware communication fingerprints across host logs. Our experiments confirm that our models are able to detect "zero-day'' malware connections with 0.7% FPR even when malware connections constitute less than 5% of Tor traces in the test set. Using multi-labeling approaches, we are able to accurately detect the malware behavior-based classes (grayware, ransomware, etc). Finally, we evaluate the robustness of our models on real-world enterprise logs and show that the classifiers can identify infected hosts even with missing features.
KW - malware
KW - tor
KW - traffic analysis
UR - http://www.scopus.com/inward/record.url?scp=85143063975&partnerID=8YFLogxK
U2 - 10.1145/3548606.3560604
DO - 10.1145/3548606.3560604
M3 - Conference contribution
AN - SCOPUS:85143063975
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 875
EP - 889
BT - CCS 2022 - Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security
PB - Association for Computing Machinery
Y2 - 7 November 2022 through 11 November 2022
ER -