We host the dctfinder application so that it can be run on our online workstations, either through Wine or directly.


Quick description of dctfinder:

Web pages do not offer reliable metadata about their creation date and time. However, obtaining the document creation time is a necessary step before temporal normalization systems can be applied to web pages. DCTFinder is a system that parses a web page and extracts its title and creation date from the page content. DCTFinder combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation time recognition.
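
To give a rough idea of what rule-based date recognition over page text can look like, here is a minimal Java sketch. It is not DCTFinder's code or API: the class name CreationDateSketch, the method findFirstDate, and the two patterns are all assumptions made for illustration, and the sketch covers neither the CRF-based extraction nor the title detection described above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of DCTFinder.
public class CreationDateSketch {

    // Rule 1: ISO-style dates such as 2014-05-28.
    private static final Pattern ISO_DATE =
        Pattern.compile("\\b\\d{4}-\\d{2}-\\d{2}\\b");

    // Rule 2: English long-form dates such as "May 28, 2014".
    private static final Pattern LONG_DATE = Pattern.compile(
        "\\b(?:January|February|March|April|May|June|July|August"
        + "|September|October|November|December) \\d{1,2}, \\d{4}\\b");

    private static final DateTimeFormatter LONG_FORMAT =
        DateTimeFormatter.ofPattern("MMMM d, uuuu", Locale.ENGLISH);

    /** Tries the ISO rule first, then the long-form rule; empty if neither yields a date. */
    public static Optional<LocalDate> findFirstDate(String text) {
        Matcher iso = ISO_DATE.matcher(text);
        while (iso.find()) {
            try {
                return Optional.of(LocalDate.parse(iso.group()));
            } catch (DateTimeParseException e) {
                // Matched the digit pattern but is not a real date; keep scanning.
            }
        }
        Matcher longForm = LONG_DATE.matcher(text);
        while (longForm.find()) {
            try {
                return Optional.of(LocalDate.parse(longForm.group(), LONG_FORMAT));
            } catch (DateTimeParseException e) {
                // Not a parseable date; keep scanning.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String page = "Posted on May 28, 2014 by the editorial team.";
        System.out.println(findFirstDate(page)); // prints Optional[2014-05-28]
    }
}
```

A real system needs far more than two patterns, and it must also decide which of several dates on a page is the creation date, which is where the learned component comes in.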

DCTFinder is released under the CeCILL free software license agreement.

The system is described in the following paper (see 'Files' section):
Xavier Tannier. "Extracting News Web Page Creation Time with DCTFinder". Proceedings of the 9th Language Resources and Evaluation Conference. Reykjavik, Iceland.

Audience: Information Technology, Science/Research.

Programming Language: Java.
Categories: Information Analysis, Linguistics.
