Input data must be tabular. The supported file formats are CSV (comma-separated values) and TSV (tab-separated values).
Each row corresponds to one sample, that is, a single data record; in customer data, for example, each sample is one customer. Each column (variable) corresponds to an attribute of the sample, such as age or gender. The first row of the data file contains the column names (variable names), and the second and subsequent rows contain the sample data. Every row must have the same number of variables. Use an empty string for a missing value. Four data types are available: string, text, numeric, and datetime. However, only text or numeric values can be specified for binary or multiclass classification, and only numeric values for regression.
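The layout rules above (a header row, then sample rows with the same number of values) can be checked before upload. The following is a minimal sketch using only Python's standard `csv` module; the function name `check_table` and the sample data are hypothetical and are not part of Easy Predictive Analytics.

```python
import csv
import io


def check_table(text, delimiter=","):
    """Return a list of layout problems found in CSV/TSV text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if len(rows) < 2:
        return ["file needs a header row and at least one sample row"]
    header = rows[0]
    problems = []
    # Row 1 is the header, so sample rows are numbered from 2.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} values, found {len(row)}"
            )
    return problems


# Hypothetical data: the third row has a missing age (empty string, allowed),
# while the fourth row is one value short (not allowed).
sample = "age,gender,purchased\n34,female,yes\n,male,no\n51,male\n"
print(check_table(sample))
```

For a TSV file, pass `delimiter="\t"`. An empty string between two delimiters still counts as a value, so missing values do not trigger a problem report.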
Easy Predictive Analytics reads the first 1000 rows of the file to determine how many distinct values the variable you want to predict takes. If the variable to be predicted is a character string, the task is treated as binary or multiclass classification according to the number of unique values found. If you want to perform binary classification, make sure that the first 1000 rows contain two distinct values for the variable you want to predict. If you want to perform multiclass classification, arrange the file so that the first 1000 rows contain at least three distinct values for the variable you want to predict. In addition, please note that more than 200 categories are not supported. For a specific example, see Detailed specifications.
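To preview how a file might be interpreted, the unique-value count over the first 1000 sample rows can be reproduced locally. This is only a sketch of the rule as described here, not the product's actual implementation: the name `guess_task`, the choice to ignore empty strings, and the `"undetermined"` result are assumptions, and the category limit is not checked.

```python
import csv
import io

SCAN_ROWS = 1000  # number of sample rows inspected, per the description above


def guess_task(text, target, delimiter=","):
    """Guess the task type for a string-valued target column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    values = set()
    for i, row in enumerate(reader):
        if i >= SCAN_ROWS:
            break
        value = row[target]
        if value:  # assumption: missing values (empty strings) are not counted
            values.add(value)
    if len(values) == 2:
        return "binary"
    if len(values) >= 3:
        return "multiclass"
    return "undetermined"  # hypothetical label for fewer than two values


print(guess_task("x,label\n1,yes\n2,no\n", "label"))
```

If a class first appears after row 1000, this check does not see it, which is why the rows may need to be reordered before upload.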