Hotel Booking Analysis

  • Tech Stack: Python, Jupyter, NumPy, Pandas, matplotlib, Seaborn, Plotly
  • GitHub URL: Project Link
  1. Framing the questions: Before any analysis, it is important to frame the questions we want the data to answer.

  2. The dataset contains 119,390 records and 32 features. Almost all features have few or no null values; the exception is the variable "company", which is missing in 94 % of records.

  3. For the prediction of cancellations, the model obtained 85 % accuracy, 81 % precision, 77 % recall, and a 79 % F1-score.

  4. Overall, 37 % of bookings are cancelled: 27 % for the resort hotel and 41 % for the city hotel.

  5. The month with the most reservations is August, with 11.65 % of the reservations. The month with the fewest is January, with 4.94 %.
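
The F1-score reported in item 3 can be sanity-checked from the reported precision and recall, since F1 is their harmonic mean. The sketch below is only an arithmetic check on the rounded figures from this summary; the helper name `f1_score` is an illustrative choice, not the project's own code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values taken from item 3 of this summary.
precision = 0.81
recall = 0.77

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 78.95%
```

The result rounds to 79 %, which agrees with the reported F1-score.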