Conclusion
Five types of violations were selected to determine which category tends to have the strongest correlation with foodborne illness incidents. The results show that food sources and restaurant facility design and maintenance may have relatively stronger impacts (together accounting for more than 50% of feature importance). Food source violations include food of unknown origin, the presence of unpasteurized sources, and water potability. Facility design and maintenance violations involve a lack of on-site sanitation and protection structures and utensils, as well as evidence of poor hygiene maintenance. However, considering the moderate accuracy and very high sensitivity of the model, violation inspection results can be used as a reference for identifying potential contributors to food poisoning, but not as evidence of direct cause.
As a result, restaurant inspectors might need to place more emphasis on regulating food sources in the food vending industry. In the meantime, when it comes to the building permit process, restaurant safety specialists should contribute their opinions when validating the design of a new restaurant.
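To make the "more than 50%" figure concrete, the following Python sketch shows how per-category feature importances could be summed into a combined share. Only the first two category names come from the text; the remaining names and all numeric values are hypothetical placeholders, not the study's results.

```python
# Hypothetical importances for five violation categories.
# These are placeholder values for illustration, NOT the study's results.
importances = {
    "food_sources": 0.30,
    "facility_design_maintenance": 0.25,
    "category_c": 0.20,  # hypothetical category name
    "category_d": 0.15,  # hypothetical category name
    "category_e": 0.10,  # hypothetical category name
}


def combined_share(importances, categories):
    """Return the fraction of total importance held by the given categories."""
    total = sum(importances.values())
    return sum(importances[c] for c in categories) / total


share = combined_share(
    importances, ["food_sources", "facility_design_maintenance"]
)
print(f"Combined share of the two categories: {share:.0%}")
```

Dividing by the total keeps the share meaningful even if the importances do not sum exactly to one.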
Weaknesses and Issues
During data processing, it was found that incident locations were not recorded precisely. Therefore, BBL was selected as the main key for merging records by location. In reality, however, more than one restaurant can be found under the same BBL, so the accuracy of the results is compromised to a certain extent.
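One way to gauge how much the BBL merge affects the results is to count how many incidents fall on a BBL shared by several restaurants. The sketch below is a minimal illustration of that check; the record layout, identifiers, and BBL values are made up and are not taken from the study's data.

```python
from collections import defaultdict

# Hypothetical records: (restaurant_id, bbl). Values are illustrative only.
restaurants = [
    ("R1", "1000010001"),
    ("R2", "1000010001"),  # shares a BBL with R1
    ("R3", "1000020005"),
]
# Hypothetical BBLs attached to reported incidents.
incident_bbls = ["1000010001", "1000020005", "1000030007"]


def restaurants_by_bbl(records):
    """Group restaurant ids under each BBL."""
    groups = defaultdict(set)
    for restaurant_id, bbl in records:
        groups[bbl].add(restaurant_id)
    return groups


def classify_incidents(incident_bbls, groups):
    """Label each incident BBL as 'unique', 'ambiguous', or 'unmatched'."""
    labels = []
    for bbl in incident_bbls:
        n = len(groups.get(bbl, ()))
        if n == 0:
            labels.append("unmatched")
        elif n == 1:
            labels.append("unique")
        else:
            labels.append("ambiguous")
    return labels


groups = restaurants_by_bbl(restaurants)
labels = classify_incidents(incident_bbls, groups)
ambiguous_share = labels.count("ambiguous") / len(labels)
print(labels, ambiguous_share)
```

Reporting the share of ambiguous matches alongside the model results would indicate how far the many-to-one merge could distort them.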
In addition, the random forest model demonstrates high sensitivity when predicting incidents, which led to many non-incident records being predicted as incidents. The way the data was processed and recorded for fitting the model might need to be reconsidered. This may also be due to the selection of the training set.
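The pattern described above, high sensitivity together with many false positives, can be read directly from a confusion matrix. The sketch below computes the relevant metrics for a small made-up set of labels (1 = incident, 0 = non-incident); the labels are hypothetical and only illustrate how the metrics relate to each other.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means incident."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(y_true, y_pred):
    """Compute sensitivity, specificity, precision, and accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": ratio(tp, tp + fn),  # share of incidents caught
        "specificity": ratio(tn, tn + fp),  # share of non-incidents kept
        "precision": ratio(tp, tp + fp),    # share of alarms that are real
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical labels: every incident is caught, but several
# non-incidents are also flagged as incidents.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

metrics = summarize(y_true, y_pred)
print(metrics)
```

If the model was built with a library that exposes class probabilities (for example, scikit-learn's `RandomForestClassifier.predict_proba`), raising the decision threshold or adjusting class weights are two options that might trade some sensitivity for fewer false positives. Whether either suits this dataset is an open question.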
Future Work Potential
The inspection results include a column indicating whether each violation is critical or non-critical. Future work could separate critical and non-critical results and apply a similar methodology to each group. However, the Inspection Procedures do not explain the division between critical and non-critical. Therefore, more clarification is needed on whether violation score or condition level affects criticality. In addition, because different types of cuisine involve various cooking methods, a further study could examine whether any particular cuisine shows a higher rate of foodborne illness incidents. The results of such a study might help improve the standards of the violation code.
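As a starting point for both ideas, incident rates can be compared across groups before any model is refit. The sketch below is a minimal illustration; the field names (`cuisine`, `critical`, `incident`) and the records themselves are hypothetical and do not reflect the actual dataset schema or results.

```python
from collections import defaultdict

# Hypothetical inspection records; field names and values are assumptions.
records = [
    {"cuisine": "A", "critical": True, "incident": 1},
    {"cuisine": "A", "critical": False, "incident": 0},
    {"cuisine": "B", "critical": True, "incident": 0},
    {"cuisine": "B", "critical": True, "incident": 1},
    {"cuisine": "B", "critical": False, "incident": 0},
]


def incident_rate_by(records, key):
    """Return the share of records with an incident for each value of `key`."""
    totals = defaultdict(int)
    incidents = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        incidents[group] += record["incident"]
    return {group: incidents[group] / totals[group] for group in totals}


by_critical = incident_rate_by(records, "critical")
by_cuisine = incident_rate_by(records, "cuisine")
print(by_critical)
print(by_cuisine)
```

Raw rates like these only describe the data. A fuller study would still need to fit and evaluate the model separately for each group, as the text proposes.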
References:
1. Firestone, M. J., & Hedberg, C. W. (2018). Restaurant inspection letter grades and Salmonella infections. Emerging Infectious Diseases, 24(12). DOI: 10.3201/eid2412.180544
2. Jones, T. F., Pavlin, B. I., LaFleur, B. J., Ingram, L. A., & Schaffner, W. (2004). Restaurant inspection scores and foodborne disease. Emerging Infectious Diseases, 10(4), 688.