
Playbook Fridays: Document Parsing and Keyword Scanning/Tagging

Automatically tag documents with keywords and focus areas without human intervention

ThreatConnect developed the Playbooks capability to help analysts automate time-consuming and repetitive tasks so they can focus on what is most important and, in many cases, to ensure the analysis process can occur consistently and in real time, without human intervention.

This Playbook is actually a set of three Playbooks: one that saves the keywords, one that verifies the data was saved as the analyst expected, and one that performs the actual work.

Many customers have reached out and voiced frustration because analysts were spending a lot of time reading through reports for specific keywords and then manually applying tags based on those keywords. This was very time-consuming, especially for one customer that had 10 separate focus areas with more than 200 different keywords.

With this Playbook set, documents are automatically tagged with keywords and focus areas without human intervention, saving the analyst about 4-5 hours per week.

This Playbook set is triggered by the creation of a document in a source (with a specific tag, “parseme”, which can be removed as a requirement after verifying expected functionality). First, you set the list of keywords in the datastore, which is contained within ElasticSearch. In JSON, you define a set of keywords, group them, and save them as variables. The main Playbook converts the document into a set of strings that is then passed to the regex capture groups for comparison. For keywords that match, the Playbook creates a tag for the group, e.g., China or Russia. Additionally, the Playbook tags the document with the actual keywords that matched, e.g., APT12 or APT28.
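The matching step described above can be sketched outside the Platform in a few lines of Python. This is only an illustration under stated assumptions, not the Playbook's actual implementation: the `KEYWORD_GROUPS` layout, the `tags_for_document` helper, and the case-insensitive, whole-word matching are all hypothetical choices made for the example.

```python
import re

# Hypothetical keyword groups; the names echo the examples in the text.
KEYWORD_GROUPS = {
    "China": ["APT12"],
    "Russia": ["APT28"],
}


def tags_for_document(text, groups):
    """Return sorted tags: each matching group name plus the keywords that matched."""
    tags = set()
    for group, keywords in groups.items():
        if not keywords:
            continue  # an empty group can never match
        # Map lowercase forms back to the keyword as it was configured.
        canonical = {k.lower(): k for k in keywords}
        # One capture group per focus area; \b avoids matches inside longer tokens.
        alternation = "|".join(re.escape(k) for k in keywords)
        pattern = re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            tags.add(group)
            tags.add(canonical[match.group(1).lower()])
    return sorted(tags)


sample = "New activity attributed to apt28 and APT12 was observed."
print(tags_for_document(sample, KEYWORD_GROUPS))
# prints ['APT12', 'APT28', 'China', 'Russia']
```

A document with no matching keywords yields an empty list, so it would receive no tags in this sketch.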


1) Import “Populate DataStore with Keywords.pbx”
In this Playbook you set a JSON array with your keywords. There are a few examples already preconfigured out of the box to get you started. This Playbook only needs to be run once to populate the datastore (and again any time the list needs to be updated).


Populate DataStore with Keywords
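To give a sense of what a grouped keyword array might look like, here is a minimal sketch that builds and sanity-checks one in Python. The field names `group` and `keywords` and the `validate_keyword_groups` helper are assumptions made for illustration; the preconfigured examples shipped in the Playbook may use a different layout.

```python
import json

# Hypothetical layout; the real Playbook's JSON schema may differ.
keyword_groups = [
    {"group": "China", "keywords": ["APT12"]},
    {"group": "Russia", "keywords": ["APT28"]},
]


def validate_keyword_groups(groups):
    """Raise ValueError unless every entry has a group name and a non-empty keyword list."""
    for entry in groups:
        name = entry.get("group")
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"missing group name: {entry!r}")
        keywords = entry.get("keywords")
        if not isinstance(keywords, list) or not keywords:
            raise ValueError(f"group {name!r} has no keywords")
        if not all(isinstance(k, str) and k.strip() for k in keywords):
            raise ValueError(f"group {name!r} has a blank or non-string keyword")
    return groups


# Serialize the checked structure to the JSON text you would paste into the Playbook.
payload = json.dumps(validate_keyword_groups(keyword_groups), indent=2)
print(payload)
```

Checking the array before saving it is a cheap way to catch an empty group or a blank keyword, either of which could otherwise produce unexpected tags later.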


2) Import “Document Keyword Check.pbx”
This Playbook needs to be set to monitor a specific owner and, as a safety measure, is currently configured to fire only on the tag “parseme”. After verifying functionality, this tag requirement can be removed so that the Playbook runs each time a document is created.

Document Keyword Check
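The gating logic of the trigger can be pictured as a simple predicate. The sketch below is an assumption about the behavior described, not the Platform's trigger code: `should_process`, its parameters, the owner name, and the case-insensitive tag comparison are all hypothetical.

```python
def should_process(doc_owner, doc_tags, monitored_owner, required_tag="parseme"):
    """Return True when the document is in the monitored owner and carries the required tag.

    Pass required_tag=None to mimic removing the tag requirement after testing.
    """
    if doc_owner != monitored_owner:
        return False
    if required_tag is None:
        return True
    # Case-insensitive comparison is an assumption made for this sketch.
    return required_tag.lower() in {tag.lower() for tag in doc_tags}


# While testing: only documents explicitly tagged "parseme" are processed.
print(should_process("My Source", ["parseme"], "My Source"))
# After verification: every new document in the monitored owner is processed.
print(should_process("My Source", [], "My Source", required_tag=None))
```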
