- move preprocess.py from loader/ to core/
- changes to the preprocessor interface: (1) add a run method that performs the main processing; (2) add a cross-validation split (see the sketch after this list); (3) add a return value; (4) merge the subclasses
- Trainer supports cross-validation
- add data as an argument to Trainer.train & Tester.test
- add readme.example.py, which runs the example program shown in README.md
- other corresponding changes
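
A minimal sketch of what the new cross-validation split might look like; the helper name cv_split and its parameters are illustrative assumptions, not the actual preprocessor API.

```python
import random

def cv_split(data, n_folds=5, shuffle=True):
    """Split data into (train, dev) pairs for n-fold cross-validation.

    Hypothetical helper; the preprocessor's run method is assumed to
    return these folds as part of its new return value.
    """
    data = list(data)
    if shuffle:
        random.shuffle(data)
    fold_size = len(data) // n_folds
    folds = [data[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
    folds[-1].extend(data[n_folds * fold_size:])  # leftovers go to the last fold
    splits = []
    for i in range(n_folds):
        dev = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, dev))
    return splits
```

Trainer can then loop over these (train, dev) pairs, training one model per fold.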
- see fastNLP/saver/logger.py to learn how to create and use a logger (sketch below)
- a log file named "train_test.log" will be created in the same directory as the entry file of the program
- this file records all important events that happen in Trainer's & Tester's methods
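
A sketch of the create-and-use pattern, built on Python's standard logging module; the create_logger name and signature are assumptions based on the notes above, so check fastNLP/saver/logger.py for the actual interface.

```python
import logging

def create_logger(logger_name, log_path, level=logging.INFO):
    """Create a named logger that appends to a file (hypothetical
    stand-in for the helper in fastNLP/saver/logger.py)."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger

# usage inside Trainer/Tester methods
logger = create_logger("train_test", "./train_test.log")
logger.info("epoch 1 finished")
logger.error("validation loss diverged")
```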
- rename Inference to Predictor
- rename Trainer.prepare_input to Trainer.load_train_data, which loads only data_train.pkl
- add a __contains__ method to the config Section class (example after this list)
- more code comments
- more elegant make_batch & data_iterator: Samplers now return batch samples instead of batch indices (see the sampler sketch after this list)
- rename "POSTrainer", "POSTester" to "SeqLabelTrainer", "SeqLabelTester"
- Trainer & Tester have NO relation with Action
- Inference owns independent "make_batch" & "data_forward"
- Conversion to Tensor & go into cuda are done in "make_batch"
- "make_batch" support maximum/minimum length
- change parameter <seq_length-->mask> in loss function defined in seq model
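
A sketch of a "make_batch" with the behavior described above: truncate to a maximum length, pad up to a minimum, convert to tensors, move to CUDA, and hand the loss a mask instead of seq_length. Parameter names (use_cuda, max_len, min_len) are assumptions.

```python
import torch

def make_batch(samples, use_cuda=False, max_len=None, min_len=None):
    """Pad a list of index sequences into one LongTensor (sketch)."""
    if max_len is not None:
        samples = [s[:max_len] for s in samples]   # enforce maximum length
    batch_len = max(len(s) for s in samples)
    if min_len is not None:
        batch_len = max(batch_len, min_len)        # enforce minimum length
    padded = [s + [0] * (batch_len - len(s)) for s in samples]
    batch = torch.tensor(padded, dtype=torch.long)
    if use_cuda and torch.cuda.is_available():
        batch = batch.cuda()
    mask = batch.ne(0)  # the seq-model loss now takes this mask, not seq_length
    return batch, mask
```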
- Trainer & Tester have Action as default parameter, shared static methods like make_batch
- add seq_len in make_batch of Inference
- add SeqLabelInfer, a subclass of Inference
- seq_labeling.py works
- Action collects the shared operations: data_forward, mode, pad, make_batch (sketch after this list)
- Trainer and Tester receive an Action as a parameter
- seq_labeling works in this setting
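
A simplified sketch of this division of labor; the method bodies are guesses at the described behavior, not the actual implementation.

```python
class Action:
    """Operations shared by Trainer and Tester (illustrative sketch)."""
    @staticmethod
    def mode(model, is_test):
        # switch the model between eval and train mode
        model.eval() if is_test else model.train()

    @staticmethod
    def pad(batch, fill=0):
        max_len = max(len(x) for x in batch)
        return [x + [fill] * (max_len - len(x)) for x in batch]

    @staticmethod
    def data_forward(network, x):
        return network(x)

class Trainer:
    def __init__(self, action=None):
        # Action supplied as a default parameter
        self.action = action if action is not None else Action()

    def train_step(self, network, raw_batch):
        self.action.mode(network, is_test=False)
        x = self.action.pad(raw_batch)
        return self.action.data_forward(network, x)
```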
- [action] add k-means bucketing, which partitions sequences into buckets of nearly the same length (sketch after this list)
- [trainer] print the training loss every 10 steps
- [loader] the CWS PKU loader splits sequences longer than max_seq_len into several shorter sequences (sketch after this list)
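
A sketch of 1-D k-means bucketing over sequence lengths; drawing batches within a bucket then wastes little padding. The function name and details are illustrative, and it assumes at least n_buckets distinct lengths.

```python
import random

def kmeans_bucketing(lengths, n_buckets, n_iters=20):
    """Group sequence indices into buckets of nearly equal length
    via 1-D k-means on the lengths (illustrative sketch)."""
    centers = random.sample(sorted(set(lengths)), n_buckets)
    assign = [0] * len(lengths)
    for _ in range(n_iters):
        # assignment step: each sequence joins the nearest center
        assign = [min(range(n_buckets), key=lambda c: abs(l - centers[c]))
                  for l in lengths]
        # update step: move each center to the mean length of its bucket
        for c in range(n_buckets):
            members = [l for l, a in zip(lengths, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    buckets = [[] for _ in range(n_buckets)]
    for idx, a in enumerate(assign):
        buckets[a].append(idx)
    return buckets  # each bucket holds indices of similar-length sequences
```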
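
And a sketch of the loader-side split: one over-long sequence becomes several consecutive chunks of at most max_seq_len (the helper name is hypothetical).

```python
def split_long_sequence(seq, max_seq_len):
    """Split a sequence into consecutive chunks no longer than max_seq_len."""
    return [seq[i:i + max_seq_len] for i in range(0, len(seq), max_seq_len)]

# e.g. a length-10 sequence with max_seq_len=4 -> chunk lengths 4, 4, 2
chunks = split_long_sequence(list(range(10)), 4)
assert [len(c) for c in chunks] == [4, 4, 2]
```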