8/4/2019 18 DeepWeb
http://slidepdf.com/reader/full/18-deepweb 1/17
Open Source Intelligence:
Access All Intelligence, All Languages, All the Time
Presented by
Abe Lederman, President and CTO
Deep Web Technologies, LLC
IOP '06, Sheraton Premier, Tysons Corner, Virginia, January 16-20
About Deep Web Technologies (DWT)
• Deployed the first "federated search" portal in the Federal Government, 1999
• Major clients include:
  - DOE Office of Scientific & Technical Information
  - Defense Technical Information Center
  - Science.gov Alliance
  - DOE Office of Science
  - National Agricultural Library
DWT is a New Mexico-based company focused on providing
state-of-the-art software solutions that search, retrieve,
aggregate, and analyze content.
Open Source Intelligence
The Problem:
• Collecting and analyzing enormous
quantities of information in any language, in myriad formats, located anywhere,
accessible through a large variety of
means, with a majority not accessible
through the Internet
Shared Challenge:
OSINT and Knowledge Discovery/Diffusion
[Diagram: "OSINT Challenges" and "Knowledge Discovery/Diffusion Challenges"]
For the past six years, DWT has been the lead technical
organization addressing these challenges, in collaboration
with the DOE Office of Scientific & Technical Information
The DWT Proposition
To apply DWT's technology, expertise
and ongoing innovations* to address the challenges of OSINT
*Developed in partnership with DOE/OSTI
Challenges in Working with Thousands of Data Sources
• Locate Reliable Sources
• Categorize Sources by Content
• Configure Sources for Searching
• Maintain Sources
Challenges in Searching Thousands of Sources
• Automatically Select Sources to Search
• Perform Many Searches in Parallel
• Translate, Analyze and Organize Results
  - Relevance Rank
  - Cluster/Visualize
  - Extract Key Information
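The "perform many searches in parallel" step can be illustrated with a minimal Python sketch. It is not DWT's implementation: the connector interface (a callable that takes a query and returns a list of result dicts), the function name `federated_search`, and the sample sources are all assumptions made for illustration, and a local thread pool stands in for a distributed grid.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def federated_search(query, connectors, max_workers=8):
    """Send one query to every source in parallel and merge what comes back.

    `connectors` maps a source name to a callable(query) -> list of dicts.
    This interface is hypothetical, not DWT's connector API.
    """
    merged, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn, query): name for name, fn in connectors.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                # Tag each result with the source it came from.
                merged.extend({**result, "source": name} for result in future.result())
            except Exception:
                # A slow or broken source must not sink the whole search.
                failed.append(name)
    return merged, failed


# Hypothetical sources: one that answers, one that is unavailable.
def source_a(query):
    return [{"title": f"{query} report A"}]


def source_b(query):
    raise ConnectionError("source unavailable")


results, failed = federated_search("fusion", {"a": source_a, "b": source_b})
```

Recording failed sources separately, rather than raising, lets the caller deliver partial results while still reporting which sources did not respond.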
DWT's State-of-the-art Federated Search Engine
• Scalable, grid-computing based federated search engine
• Sophisticated Search Conductor
• Supports custom connectors
• Multi-tier relevance ranking
• Framework accepts integration of advanced
linguistic, analysis, and visualization modules
ResearchAssistant™
Grid Computing:
Distributing the Workload
Search Conductor
[Flowchart. Steps: Select sources to search; Perform search; Deliver results to user. Decision points, each with YES/NO branches: "Enough good results?"; "Can I get more results from "good" sources?"]
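One plausible reading of the Search Conductor flowchart is sketched below in Python. The branch wiring is an assumption, since the slide text does not preserve which arrow leads where: here the conductor searches, checks whether it has enough good results, and if not, asks only the sources that produced good results for another page. The names `conduct_search`, `is_good`, `wanted`, and the paged source interface are hypothetical.

```python
def conduct_search(query, sources, is_good, wanted=10, max_rounds=5):
    """Loop until enough good results are collected or no source can add more.

    `sources` maps a name to a callable(query, page) -> list of results
    (a hypothetical paged interface). `is_good` decides which results count.
    """
    active = list(sources)  # "Select sources to search" (here: all of them)
    good = []
    page = 0
    for _ in range(max_rounds):
        productive = []
        for name in active:  # "Perform search"
            hits = [r for r in sources[name](query, page) if is_good(r)]
            if hits:
                good.extend(hits)
                productive.append(name)
        if len(good) >= wanted:  # "Enough good results?" -> YES
            break
        if not productive:  # "Can I get more results from good sources?" -> NO
            break
        active = productive  # YES: go back only to the sources that delivered
        page += 1
    return good[:wanted]  # "Deliver results to user"


# Hypothetical sources: one returns two results per page, one returns nothing.
def src_a(query, page):
    return [f"a-{page}-{i}" for i in range(2)]


def src_b(query, page):
    return []


delivered = conduct_search("fusion", {"a": src_a, "b": src_b},
                           is_good=lambda r: True, wanted=5)
```

The `max_rounds` cap is a safety limit added for the sketch, so that a source that always returns results cannot keep the loop running indefinitely.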
Multi-tier Relevance Ranking
• QuickRank™: Ranks results based on occurrence of search terms in title and snippet
• MetaRank™: Ranks results utilizing custom algorithms applied to metadata
• DeepRank™: Downloads and indexes full-text documents
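The first tier, ranking by occurrence of search terms in the title and snippet, can be illustrated with a short Python sketch. This is not DWT's QuickRank algorithm, which the slides do not describe in detail. The function name `quick_rank`, the simple word tokenization, and the weights (title matches counting twice as much as snippet matches) are assumptions.

```python
import re


def term_count(text, terms):
    """Count how many times any of the query terms appears in the text."""
    words = re.findall(r"\w+", text.lower())
    return sum(words.count(term) for term in terms)


def quick_rank(results, query, title_weight=2.0, snippet_weight=1.0):
    """Order results by weighted term occurrences in title and snippet.

    The weights are illustrative assumptions, not values from the slides.
    """
    terms = set(re.findall(r"\w+", query.lower()))

    def score(result):
        return (title_weight * term_count(result.get("title", ""), terms)
                + snippet_weight * term_count(result.get("snippet", ""), terms))

    # sorted() is stable, so results with equal scores keep their input order.
    return sorted(results, key=score, reverse=True)


# Hypothetical results, as they might come back from several sources.
hits = [
    {"title": "Wind power", "snippet": "turbines"},
    {"title": "Solar energy", "snippet": "solar cells and solar panels"},
    {"title": "Solar", "snippet": ""},
]
ranked_titles = [r["title"] for r in quick_rank(hits, "solar")]
```

A tier like this needs only the text already returned with each result, which is why it can run before any metadata processing or full-text download.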
Science.gov Alliance
Consortium of 12 Federal Government Agencies
Dept of Agriculture
Dept of Commerce
Dept of Defense
Dept of Education
Dept of Energy
Dept of Health/Human Services
Dept of Interior
Environmental Protection Agency
NASA
National Science Foundation
US Government Printing Office
National Archives & Records Administration
Sponsoring
Science.gov Portal
(Access to most of Federal Government R&D
Science.gov Advanced Search Page
Next Steps
Identify sponsors and development
partners that can collaborate on the
development of a pilot that integrates
best-of-breed technologies of value to OSINT.
This pilot will result in a portal that
aggregates content of different types,
generating actionable intelligence.
Contact Us
Abe Lederman
122 Longview Drive
Los Alamos, NM 87544
www.deepwebtech.com
http://www.deepwebtech.com/talks/IOP.ppt