We host tika-python so that you can run it on our online workstations, either directly or with Wine.


Quick description of tika-python:

tika-python is a Python port of the Apache Tika library that makes Tika available through the Tika REST Server. It exposes Apache Tika as a Python library, installable via Setuptools, Pip and Easy Install. To use this library, you need Java 7+ installed on your system, because tika-python starts the Tika REST server in the background.

To get this working in a disconnected environment, download a Tika server file (both tika-server.jar and tika-server.jar.md5) and set the TIKA_SERVER_JAR environment variable to TIKA_SERVER_JAR="file:////tika-server.jar". This tells tika-python to "download" the file from the local path, move it to /tmp/tika-server.jar and run it as a background process. This is the only way to run tika-python without internet access. If the variable is not set, the default is to check the Tika version and pull the latest server jar from Apache every time.
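The offline setup described above can be sketched in Python as follows. This is a minimal illustration, not the library's own setup code: the helper names `configure_offline_tika` and `parse_offline` and the jar location are hypothetical, and the sketch assumes that tika-server.jar and tika-server.jar.md5 have already been downloaded side by side.

```python
import os
from pathlib import Path


def configure_offline_tika(jar_path):
    """Point TIKA_SERVER_JAR at a local server jar (hypothetical helper).

    Returns the file:// URI that was stored in the environment.
    """
    # resolve() makes the path absolute, which as_uri() requires.
    uri = Path(jar_path).resolve().as_uri()
    os.environ["TIKA_SERVER_JAR"] = uri
    return uri


def parse_offline(document_path, jar_path):
    """Parse a document using only the local jar (needs Java and tika installed)."""
    # Set the variable before importing tika so the library sees it.
    configure_offline_tika(jar_path)
    from tika import parser  # imported lazily; requires `pip install tika`
    parsed = parser.from_file(document_path)
    return parsed["metadata"], parsed["content"]
```

A call such as `parse_offline("report.pdf", "/opt/tika/tika-server.jar")` (both paths are placeholders) would then run without contacting Apache, provided Java is available.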

Features:
  • Parser interface (kept for backwards compatibility with pre-REST versions)
  • The parser interface extracts text and metadata using the /rmeta interface
  • Optionally, you can pass the Tika server URL along with the call, which is useful for multi-instance execution
  • The output format can be set to XHTML
  • The unpack interface handles both metadata and text extraction in a single call
  • The unpack call returns a tarball of metadata and text entries, which is unpacked internally
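The parser and unpack interfaces listed above might be used roughly as follows. This is a hedged sketch: `build_parse_kwargs`, `extract` and `extract_unpacked` are hypothetical wrapper names, and the server URL in the usage note is only an example. The underlying calls are tika-python's `parser.from_file` (with its `serverEndpoint` and `xmlContent` arguments) and `unpack.from_file`.

```python
from typing import Any, Dict, Optional


def build_parse_kwargs(server_url: Optional[str] = None,
                       xhtml: bool = False) -> Dict[str, Any]:
    """Collect optional keyword arguments for parser.from_file (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if server_url is not None:
        # Send the request to a specific Tika server instance.
        kwargs["serverEndpoint"] = server_url
    if xhtml:
        # Ask for XHTML output instead of plain text.
        kwargs["xmlContent"] = True
    return kwargs


def extract(path: str, server_url: Optional[str] = None, xhtml: bool = False):
    """Extract metadata and text through the parser interface."""
    from tika import parser  # imported lazily; requires tika and Java
    parsed = parser.from_file(path, **build_parse_kwargs(server_url, xhtml))
    return parsed["metadata"], parsed["content"]


def extract_unpacked(path: str):
    """Extract metadata and text in a single call through the unpack interface."""
    from tika import unpack  # imported lazily; requires tika and Java
    parsed = unpack.from_file(path)
    return parsed["metadata"], parsed["content"]
```

For example, `extract("report.pdf", server_url="http://localhost:9998", xhtml=True)` would request XHTML from a server assumed to be listening on that address, while `extract_unpacked("report.pdf")` uses the default server.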


Programming Language: Python.
Categories:
Text Processing, Healthcare, Machine Learning

©2024. Winfy. All Rights Reserved.

By OD Group OU – Registry code: 1609791 – VAT number: EE102345621.