This is the second of our three 'Automation Temptation' notes, sharing how we at Robotic Online Intelligence approach the automation of research. In Note #1, we started with the pure efficiency of the production of research reports, and now turn to market intelligence - building your army of robo-analysts...
Note #2 - Market Intelligence ‘M.A.S.T.E.R’: Model, Aggregate, Standardize, Tag, Evaluate, Release.
By ‘market intelligence’ we mean here the external unstructured textual or numerical information that can be significant to your decisions and judgments.
For B2B information businesses, media, and corporate and investment research teams, automating the aggregation and filtering of such market intelligence for internal use can save valuable hours or days of research time and potentially deliver an information advantage.
In practice, market intelligence could mean tracking announcements of new products, fund launches or strategies, new policies, analysts' opinions about the market outlook, central banks’ comments on inflation, new data on prices and sales, revealed transaction or project details, or crypto listings. Our open access Covid-19 Global Impact Tracker from March 2020, for instance, is fully automated. Another example is how we encoded China Property to pick up signals from local sources.
We believe you don’t need millions or billions of sources, but rather the select few hundred (or few thousand) that really matter, and human experts are able to define these.
B2B INFORMATION BUSINESSES, MEDIA
Databasing unstructured information into a useful format has always been a core part of the B2B information business. There is room for big efficiency gains through automation that goes beyond extracting data from documents and focuses on pre-selecting what the analysts spend their time on.
For media editorial operations, automating the scans of local sources, private companies, or sectors with highly fragmented disclosure can yield more valuable pickups, improve cross-checks, and lead to greater speed in production.
The chart below illustrates the information types and sources and the flow to human analysts, who then transform the mass of data into databases served to the company’s clients. Some of our customer deployments show that it is possible to save 30% to 50% of analysts' time by eliminating irrelevant information from the incoming pool of content while losing only 0.3%-3% of the 'good items'.
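To make the trade-off concrete, here is a minimal sketch of the arithmetic behind such figures. The counts are hypothetical, not taken from any deployment, and the sketch assumes that analyst time scales with the number of items reviewed.

```python
def filter_tradeoff(total_items, good_items, removed_items, removed_good):
    """Return (share of review volume saved, share of good items lost)."""
    if total_items <= 0 or good_items <= 0:
        raise ValueError("total_items and good_items must be positive")
    if removed_items > total_items or good_items > total_items:
        raise ValueError("counts cannot exceed total_items")
    if not 0 <= removed_good <= min(removed_items, good_items):
        raise ValueError("removed_good must fit within removed_items and good_items")
    volume_saved = removed_items / total_items  # items the analysts no longer read
    good_lost = removed_good / good_items       # relevant items wrongly filtered out
    return volume_saved, good_lost


# Hypothetical week: 1,000 incoming items, 200 of them worth databasing.
# The filter removes 400 items, 2 of which were actually good.
saved, lost = filter_tradeoff(
    total_items=1000, good_items=200, removed_items=400, removed_good=2
)
print(f"Review volume saved: {saved:.0%}, good items lost: {lost:.1%}")
# prints "Review volume saved: 40%, good items lost: 1.0%"
```

Measuring both numbers on a labelled sample is what lets a team decide how aggressive the filter can be before the loss of good items becomes unacceptable.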
If you consider data collection and databasing as a form of modern 'manufacturing' plant, and if you have more than a few people working on it, automated reporting on key performance metrics can enhance the supervision of such operations.
In our internal use case on a smaller scale, for the databases of Projects and Deals in the data center sector, we scan information from over 2,000 sources across 100 companies and 4 languages, looking at company announcements, government approvals and local project news via websites, SEC filings and tweets. The analysts can then review the ‘potential inputs into database’ and do their more complex human part. Once there is a database record, the system keeps checking for any additional information likely to be relevant.
BROADER MARKET INTELLIGENCE
Getting a good handle on the significant external signals and sources relevant to your specific domain can be a source of competitive edge, but for the intelligence system to be truly useful, it needs to be customized.
The problem with standardized solutions is that they may not fit the particular sector or geography you are focused on and are unlikely to give your firm an edge over competitors using the same thing.
In the real estate domain, for example, what if you only want to focus on potentially distressed property developers in China?
The need to customize is at the heart of our approach, where we believe in a domain- or vertical-specific build. We do not aim for an all-encompassing 'consistent' classification across domains, but rather model domain by domain, such as China housing, cryptocurrencies, data centers, logistics, private equity funds, or central banks.
It's simply hard to configure a semantic system that would be both internally consistent and deeply relevant to specific domains at the same time.
OUR QUEST FOR A 'SYSTEM'
Since 2017 we have been developing, in close collaboration with clients, a highly configurable market intelligence engine, Kubro™, originally designed for our own China research business. It is a workflow platform (SaaS) to help data companies:
a) automate manual tasks of the analysts who collect data;
b) help the management to supervise the modern 'data factories';
c) develop new information products for their clients, powered by the Kubro™ engine at the back-end.
Any domain and language, with no coding required.
Our broader framework is illustrated in the chart below. The main components are the overall model of the domain, the aggregation and standardization of information from various sources, tagging into topic models, the evaluation of significance through scoring rules, and the release of the outputs in suitable formats (email, PDF, web, API).
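The components described above can be sketched as a toy pipeline. This is only an illustration of the Model, Aggregate, Standardize, Tag, Evaluate, Release sequence under simple assumptions, not how Kubro™ is implemented: the keyword lists, scores, threshold and all function names are hypothetical.

```python
from dataclasses import dataclass, field

# Model: a hypothetical domain model mapping topics to signal keywords.
DOMAIN_MODEL = {
    "new_project": ["breaks ground", "new data center"],
    "deal": ["acquires", "acquisition", "joint venture"],
}
# Hypothetical scoring rules: points per matched topic.
TOPIC_SCORES = {"new_project": 2, "deal": 3}
RELEASE_THRESHOLD = 3  # hypothetical cut-off for what reaches the analysts


@dataclass
class Item:
    source: str
    text: str
    tags: list = field(default_factory=list)
    score: int = 0


def aggregate(raw_feeds):
    """Aggregate: flatten {source: [texts]} into one list of items."""
    return [Item(src, text) for src, texts in raw_feeds.items() for text in texts]


def standardize(item):
    """Standardize: collapse whitespace and lowercase the text."""
    item.text = " ".join(item.text.split()).lower()
    return item


def tag(item):
    """Tag: attach every topic with at least one keyword in the text."""
    item.tags = [
        topic
        for topic, keywords in DOMAIN_MODEL.items()
        if any(keyword in item.text for keyword in keywords)
    ]
    return item


def evaluate(item):
    """Evaluate: sum the score of each matched topic."""
    item.score = sum(TOPIC_SCORES[topic] for topic in item.tags)
    return item


def release(items):
    """Release: keep items at or above the threshold, highest score first."""
    kept = [item for item in items if item.score >= RELEASE_THRESHOLD]
    return sorted(kept, key=lambda item: item.score, reverse=True)


def run_pipeline(raw_feeds):
    items = [evaluate(tag(standardize(item))) for item in aggregate(raw_feeds)]
    return release(items)


feeds = {
    "company_site": ["Acme  Acquires land for a New Data Center campus"],
    "local_news": ["Weather update for the weekend"],
}
for item in run_pipeline(feeds):
    print(item.score, item.tags, item.source)
# prints "5 ['new_project', 'deal'] company_site"
```

In a real build the release step would write to email, PDF, web or an API rather than print, and the tagging would rely on far richer rules than substring matches.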
The automation kicks in at the aggregation and search stage and plays a crucial role in classification, filtering and publishing, where both AI and deterministic methods play their parts. In two separate posts, we explain our applications of AI (in the form of a neural network for short-form text classification) and under-utilized regular expressions (Regex) for handling the patterns in text.
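As a small illustration of the deterministic side, the sketch below uses a regular expression to pull deal sizes out of a headline. The pattern, the unit table and the sample headline are hypothetical and cover only a few common ways of writing US dollar amounts.

```python
import re

# Matches e.g. "US$1.2 billion", "USD 500 million", "$300 mn".
AMOUNT_RE = re.compile(
    r"(?:US\$|USD\s?|\$)\s?(\d+(?:\.\d+)?)\s?(million|billion|mn|bn)\b",
    re.IGNORECASE,
)
MULTIPLIERS = {"million": 1e6, "mn": 1e6, "billion": 1e9, "bn": 1e9}


def extract_amounts(text):
    """Return the dollar amounts found in text as plain numbers."""
    return [
        float(number) * MULTIPLIERS[unit.lower()]
        for number, unit in AMOUNT_RE.findall(text)
    ]


headline = "Developer raises US$1.2 billion; rival plans $300 mn data center"
print(extract_amounts(headline))
```

Patterns like this are transparent and easy to audit, which makes them a useful complement to a statistical classifier when the text follows predictable conventions.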
Clients and partners have deployed Kubro™ in domains ranging from real estate to cryptocurrencies in the US and Asia. The system supports web-based sources, Twitter accounts, exchange filings, and emails as inputs, or any internal content.
We like to think of all this as software that helps an organization build their army of robo-analysts in a very customized yet efficient way.
But robo-analysts individually have the potential to go beyond pure automation. We will share more on that next week in the last note of this 'series', Automation Temptation Part 3: Robo Analyst Jr.
Robert Ciemniak is the Founder-CEO of Robotic Online Intelligence Ltd (ROI) and Real Estate Foresight (REF).