DocWire DocToText online with Winfy

We host DocWire DocToText so that you can run it on our online workstations, either with Wine or directly.


Quick description of DocWire DocToText:

DocWire's DocToText is a multifaceted data extraction software development toolkit that converts many kinds of files to plain text and HTML. Written in C++, it includes a parser that can convert PST and OST files, along with a new API for file processing.
DocToText can be integrated with other data mining and data analytics applications. It comes with a high-grade, scriptable and trainable OCR engine that uses LSTM neural networks for character recognition. The document parser extracts metadata and annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), and images (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP).

Available under GNU General Public License version 2.0 (GPLv2) and commercial licensing.

Features:
  • Ability to extract/import and export text, images, formatting and metadata along with annotations
  • Data can be transformed between import and export (filtering, aggregating, etc.)
  • Equipped with multiple importers
  • Equipped with multiple exporters
  • Equipped with a high-grade, scriptable and trainable OCR engine that uses LSTM neural networks for character recognition
  • Incremental parsing returning data as soon as they are available
  • Cross-platform: Linux, Windows, macOS (with more to come)
  • Can be embedded in your application (SDK)
  • Can be integrated with other data mining and data analytics applications
  • The parsing process can be designed by connecting objects into a chain with the pipe (|) operator
  • Parsing chain elements communicate through Boost Signals
  • Custom parsing chain elements can be added (importers, transformers, exporters)
  • Small binaries, fast native C++ code
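
The pipe-based chaining mentioned in the feature list can be illustrated with a minimal C++17 sketch. This is not the DocWire API: the Stage type and the strip_tags and to_upper helpers are hypothetical names invented for this example, and the stages call each other directly instead of communicating through Boost Signals. It only shows how overloading operator| lets small processing steps be connected into a chain.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <utility>

// Hypothetical chain element (NOT a DocWire type): one text-to-text step.
struct Stage {
    std::function<std::string(std::string)> fn;
};

// value | stage: feed text into a single stage and return its output.
inline std::string operator|(std::string input, const Stage& stage) {
    assert(stage.fn && "a stage must wrap a callable");
    return stage.fn(std::move(input));
}

// stage | stage: compose two stages into one that runs them in order.
inline Stage operator|(Stage first, Stage second) {
    return Stage{[f = std::move(first), s = std::move(second)](std::string in) {
        return s.fn(f.fn(std::move(in)));
    }};
}

// Hypothetical "importer": drops everything between '<' and '>'.
inline Stage strip_tags() {
    return Stage{[](std::string in) {
        std::string out;
        bool in_tag = false;
        for (char c : in) {
            if (c == '<') in_tag = true;
            else if (c == '>') in_tag = false;
            else if (!in_tag) out.push_back(c);
        }
        return out;
    }};
}

// Hypothetical "transformer": upper-cases ASCII letters.
inline Stage to_upper() {
    return Stage{[](std::string in) {
        std::transform(in.begin(), in.end(), in.begin(), [](unsigned char c) {
            return static_cast<char>(std::toupper(c));
        });
        return in;
    }};
}
```

With these definitions, the expression std::string("<p>hello</p>") | strip_tags() | to_upper() evaluates to "HELLO". A chain can also be built once (Stage chain = strip_tags() | to_upper();) and reused on many inputs. A custom chain element is just another function that returns a Stage.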


Audience: Advanced End Users, Developers, End Users/Desktop.
User interface: Command-line.
Programming Language: C++, C.
Categories:
Text Processing, Libraries, Data Recovery, OCR, Data Analytics

©2024. Winfy. All Rights Reserved.

By OD Group OU – Registry code: 1609791 – VAT number: EE102345621.