ESANN 2018 - Proceedings

26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning

Paperback - In English 48,00 €
PDF - In English 32,00 €

Info: For more information about VAT and other payment options, see "Payment & VAT" below.

Details


Publisher
ESANN
Imprint
Presses universitaires Saint-Louis Bruxelles
Author
Collective
Language
English
BISAC Subject Heading
TEC000000 TECHNOLOGY & ENGINEERING
BIC subject category (UK)
T Technology, engineering, agriculture
First published
22 March 2018

Paperback


Publication date
01 January 2007
ISBN-13
9782874630828
Extent
Number of pages, main content: 182
Code
76399
Format
16 x 24 x 1 cm
Weight
510 grams
Recommended retail price
19,70 €
ONIX XML
Version 2.1, Version 3

PDF


Publication date
01 January 2007
ISBN-13
9782874635045
Extent
Number of pages, main content: 182
Code
76399PDF
ONIX XML
Version 2.1, Version 3


Contents


Table of Contents (p. vii)

Preface (p. 1)

WAC3 (p. 3)

Kevin P. SCANNELL, The Crúbadán Project: Corpus building for under-resourced languages (p. 5)

Sebastian BLOHM, Philipp CIMIANO, A Human Evaluation of Filtering Functions for Pattern-based Extraction of Arbitrary Relations from the Web (p. 17)

Emmanuel CARTIER, TextBox, a Written Corpus Tool for Linguistic Analysis (p. 33)

William H. FLETCHER, Implementing a BNC-Compare-able Web Corpus (p. 43)

Fabrice ISSAC, Yet Another Web Crawler (p. 57)

Igor LETURIA, Antton GURRUTXAGA, Iñaki ALEGRIA, Aitzol EZEIZA, CorpEus, a 'web as corpus' tool designed for the agglutinative nature of Basque (p. 69)

Serge SHAROFF, Classifying Web corpora into domain and genre using automatic feature identification (p. 83)

Anil Kumar SINGH, Jagadeesh GORLA, Identification of Languages and Encodings in a Multilingual Document (p. 95)

CLEANEVAL (p. 109)

Daniel BAUER, Judith DEGEN, Xiaoye DENG, Priska HERGER, Jan GASTHAUS, Eugenie GIESBRECHT, Lina JANSEN, Christin KALINA, Thorben KRÜGER, Robert MÄRTIN, Martin SCHMIDT, Simon SCHOLLER, Johannes STEGER, Egon STEMLE, Stefan EVERT, FIASCO: Filtering the Internet by Automatic Subtree Classification, Osnabrück (p. 111)

Stefan EVERT, StupidOS: A high-precision approach to boilerplate removal (p. 123)

Weizheng GAO, Tony ABOU-ASSALEH, GenieKnows Web Page Cleaning System (p. 135)

Christian GIRARDI, Htmcleaner: Extracting the Relevant Text from the Web Pages (p. 141)

Katja HOFMANN, Wouter WEERKAMP, Web Corpus Cleaning using Content and Structure (p. 145)

Michal MAREK, Pavel PECINA, Miroslav SPOUSTA, Web Page Cleaning with Conditional Random Fields (p. 155)

Xabier SARALEGI, Igor LETURIA, Kimatu, a tool for cleaning non-content text parts from HTML docs (p. 163)