Most text data drawn from document databases are not, in their raw form, suitable for analytical methods based on statistics and machine learning. Patent documents are likewise collected as text datasets. As with other document collections, they must therefore be transformed into structured data before a statistical analysis can be applied. This transformation is carried out with the preprocessing techniques of text mining, after which the patent documents can be analyzed. A patent analysis thus requires two phases: preprocessing and analysis.
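As an illustration of the preprocessing phase, the following minimal sketch turns raw text into a document-term matrix, one common structured form for statistical analysis. The text above does not specify the preprocessing steps, so the choices here (lowercasing, alphabetic tokenization, stop-word removal, raw term counts) are assumptions, as are the names `STOP_WORDS`, `tokenize`, and `document_term_matrix` and the two made-up example documents.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustrative only).
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "is", "in", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def document_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Made-up example texts, not real patent data.
patents = [
    "A method for wireless charging of a battery",
    "Battery cooling system for an electric vehicle",
]
vocab, dtm = document_term_matrix(patents)
```

Each row of `dtm` corresponds to one document and each column to one term in `vocab`, so the result can be passed to the analysis phase as an ordinary numeric table. In practice, steps such as stemming or term weighting might be added, depending on the analysis.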