146 spaCy preprocessing
Sets up the basic preprocessing pipeline. Known inconsistencies in the resulting tokens:
- Words joined by a dash are not split into separate tokens
- Tokens may contain diacritical marks that should not be there
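The two inconsistencies listed above could be smoothed over in a post-processing step on the token strings. The following is a minimal sketch using only the Python standard library, not part of the pipeline itself: the function names (`strip_diacritics`, `split_on_dashes`, `clean_tokens`) are hypothetical, and it assumes that every diacritic is unwanted, which would also remove legitimate accents in languages that rely on them.

```python
import re
import unicodedata
from typing import Iterable, List

# Hyphen-minus plus the common Unicode dash characters (assumed set).
_DASHES = re.compile(r"[-\u2010\u2011\u2012\u2013\u2014]")


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'naïve' -> 'naive'."""
    # NFD splits each accented character into base character + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def split_on_dashes(token: str) -> List[str]:
    """Split a token on dashes and drop empty pieces."""
    return [part for part in _DASHES.split(token) if part]


def clean_tokens(tokens: Iterable[str]) -> List[str]:
    """Apply both fixes to a sequence of token strings."""
    cleaned: List[str] = []
    for token in tokens:
        cleaned.extend(split_on_dashes(strip_diacritics(token)))
    return cleaned


if __name__ == "__main__":
    print(clean_tokens(["state-of-the-art", "cafe\u0301", "naïve"]))
    # → ['state', 'of', 'the', 'art', 'cafe', 'naive']
```

With spaCy, such a function could be applied to the token texts, for example `clean_tokens(t.text for t in doc)`. Handling the dash case inside the tokenizer itself, through its infix rules, may be the cleaner option if token offsets need to be preserved.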