References
Buuren, S. van. 2012. Flexible Imputation of Missing Data.
Chapman & Hall/CRC Interdisciplinary Statistics. CRC Press. https://books.google.com/books?id=elDNBQAAQBAJ.
Galli, S. 2020. Python Feature Engineering Cookbook: Over 70 Recipes
for Creating, Engineering, and Transforming Features to Build Machine
Learning Models. Packt Publishing. https://books.google.com/books?id=2c_LDwAAQBAJ.
Géron, Aurélien. 2017. Hands-on Machine Learning with Scikit-Learn
and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent
Systems. Sebastopol, CA: O’Reilly Media.
Honnibal, Matthew, Ines Montani, Sofie Van Landeghem, and Adriane Boyd.
2020. “spaCy: Industrial-strength Natural
Language Processing in Python.” https://doi.org/10.5281/zenodo.1212303.
Kuhn, M., and K. Johnson. 2013. Applied Predictive Modeling.
SpringerLink : Bücher. Springer New York. https://books.google.com/books?id=xYRDAAAAQBAJ.
Kuhn, M., and K. Johnson. 2019. Feature Engineering and Selection: A Practical Approach
for Predictive Models. Chapman & Hall/CRC Data Science Series.
CRC Press. https://books.google.com/books?id=q5alDwAAQBAJ.
Kuhn, M., and J. Silge. 2022. Tidy Modeling with R. O’Reilly
Media. https://books.google.com/books?id=98J6EAAAQBAJ.
Lewis, David D., Yiming Yang, Tony G. Rose, and Fan Li. 2004.
“RCV1: A New Benchmark Collection for Text
Categorization Research.” Journal of Machine Learning
Research 5: 361–97. https://www.jmlr.org/papers/volume5/lewis04a/lewis04a.pdf.
Luhn, H. P. 1960. “Key Word-in-Context Index for Technical
Literature (KWIC Index).” American Documentation 11 (4):
288–95. https://doi.org/10.1002/asi.5090110403.
Micci-Barreca, Daniele. 2001. “A Preprocessing Scheme for
High-Cardinality Categorical Attributes in Classification and Prediction
Problems.” SIGKDD Explor. Newsl. 3 (1): 27–32. https://doi.org/10.1145/507533.507538.
Nothman, Joel, Hanmin Qin, and Roman Yurchak. 2018. “Stop Word
Lists in Free Open-Source Software Packages.” In Proceedings
of Workshop for NLP Open Source Software
(NLP-OSS), edited by Eunjeong L. Park,
Masato Hagiwara, Dmitrijs Milajevs, and Liling Tan, 7–12. Melbourne,
Australia: Association for Computational Linguistics. https://doi.org/10.18653/v1/W18-2502.
Ozdemir, S. 2022. Feature Engineering Bookcamp. Manning. https://books.google.com/books?id=3n6HEAAAQBAJ.
Porter, Martin F. 1980. “An Algorithm for Suffix
Stripping.” Program 14 (3): 130–37. https://doi.org/10.1108/eb046814.
Porter, Martin F. 2001. “Snowball: A Language for Stemming Algorithms.”
https://snowballstem.org.
Robertson, Stephen. 2004. “Understanding Inverse Document
Frequency: On Theoretical Arguments for IDF.” Journal of
Documentation 60 (5): 503–20.
Rubin, Donald B. 1976. “Inference and Missing
Data.” Biometrika 63 (3): 581–92. https://doi.org/10.1093/biomet/63.3.581.
Sparck Jones, K. 1972. “A Statistical Interpretation of Term
Specificity and Its Application in Retrieval.” Journal of
Documentation 28 (1): 11–21. https://doi.org/10.1108/eb026526.
Thakur, A. 2020. Approaching (Almost) Any Machine Learning
Problem. Amazon Digital Services LLC - KDP. https://books.google.com/books?id=ZbgAEAAAQBAJ.