Labor Hours Predictor
Industry: Construction
Dataset: A CSV file with the size of the job and the log(size) of the job as the input variables. The number of hours needed to complete the job is the target variable, marked with an asterisk (*) in the header row.
Predictor: The predictors created are regression algorithms (see all the predictors for this dataset). The best predictor is plain Linear Regression, with an average error of 37% (test this predictor). This error is high, which suggests that the data is noisy. The dumb error is the baseline for comparison: the average error obtained when the average number of hours is used as the prediction for every job.
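To make the comparison between a regression predictor and the dumb error concrete, here is a minimal sketch in plain Python. It is not the code behind the predictor on this page. The job sizes and hours are made-up illustration values, the helper names (fit_linear, predict, mean_pct_error) are hypothetical, and "average error" is assumed to mean the mean absolute percentage error. The errors are computed on the training data for brevity; a real evaluation would use held-out jobs.

```python
import math

def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] +
         [sum(x[i] * y for x, y in zip(X, targets))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(w, row):
    """Intercept plus the weighted sum of the input variables."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

def mean_pct_error(preds, actual):
    """Assumed error metric: mean of |prediction - actual| / actual."""
    return sum(abs(p - a) / a for p, a in zip(preds, actual)) / len(actual)

# Made-up jobs: size is the raw input, log(size) is the second input variable.
sizes = [10, 20, 40, 80, 160, 320]
hours = [14, 22, 41, 70, 150, 260]
rows = [(s, math.log(s)) for s in sizes]

w = fit_linear(rows, hours)
model_preds = [predict(w, r) for r in rows]

# Dumb predictor: always answer with the average number of hours.
mean_hours = sum(hours) / len(hours)
dumb_preds = [mean_hours] * len(hours)

model_error = mean_pct_error(model_preds, hours)
dumb_error = mean_pct_error(dumb_preds, hours)
print(f"model error: {model_error:.1%}  dumb error: {dumb_error:.1%}")
```

A predictor is only useful if its error is clearly below the dumb error, which is why the two numbers are reported side by side.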
Language Predictor
Industry: Web Development
Dataset: A .zip file with folders containing text files written in different languages (see picture below).
Predictor: The predictors created are classification algorithms such as Support Vector Machines (SVM) and Naive Bayes (see all the predictors for this dataset). Both the SVM and the Naive Bayes predictors classified all the text documents correctly. You can test this predictor by copying and pasting some text into the text box on this page. The dumb error is the baseline for comparison: the average error obtained when the most common category is used as the predicted language for every document.
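As a rough illustration of how a Naive Bayes language classifier and its dumb error can be computed, here is a minimal sketch in plain Python. It is not the code behind the predictor on this page. The training sentences are made up, the function names (char_ngrams, train, classify) are hypothetical, and the choice of character trigrams with add-one smoothing is an assumption, since the page does not say which features were used. The SVM predictor is not shown.

```python
import math
from collections import Counter, defaultdict

def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (assumed feature choice)."""
    text = " " + text.lower() + " "
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train(docs):
    """docs is a list of (text, label) pairs; returns n-gram and label counts."""
    counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in docs:
        counts[label].update(char_ngrams(text))
        label_counts[label] += 1
    return counts, label_counts

def classify(text, counts, label_counts):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    vocab_size = len(set().union(*counts.values()))
    total_docs = sum(label_counts.values())
    grams_in_text = char_ngrams(text)
    best_label, best_score = None, -math.inf
    for label, grams in counts.items():
        total = sum(grams.values())
        score = math.log(label_counts[label] / total_docs)  # log prior
        for g in grams_in_text:
            score += math.log((grams[g] + 1) / (total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training documents, one label per language folder.
docs = [
    ("the quick brown fox jumps over the lazy dog", "english"),
    ("this is the house where the cat lives", "english"),
    ("le renard brun saute par dessus le chien paresseux", "french"),
    ("le chat habite dans la maison", "french"),
    ("el zorro salta sobre el perro perezoso", "spanish"),
    ("el gato vive en la casa", "spanish"),
]
counts, label_counts = train(docs)

# Dumb predictor: always answer with the most common category.
most_common_label = label_counts.most_common(1)[0][0]
dumb_error = sum(label != most_common_label for _, label in docs) / len(docs)

print(classify("the dog is in the house", counts, label_counts))
print(f"dumb error: {dumb_error:.1%}")
```

With evenly sized language folders the dumb error is high, so even a simple classifier has a lot of room to improve on it.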

