Sales forecasting is the process of estimating future sales and revenue so that companies can make informed business decisions and predict short-term and long-term performance. Companies can base their forecasts on past sales data, industry-wide comparisons, and economic trends. Sales forecasting can be treated as a time-series forecasting problem, because the data (sales or revenue) changes over time.
Time Series Analysis
A time series is a sequence of data points ordered by time. Time series analysis is a methodology for extracting useful and meaningful information from these data points. Any time series can be decomposed into three components:
- Trend: the long-term change in the level of the data over time. For example, in a time series with a positive trend, the values at time (t+n) tend to be larger than the values at time (t). Here the value of the data depends on the time itself rather than on the previous values.
- Seasonality (Cycle): the repetition of the data over the time domain. In other words, the data values at time (t+n) are approximately the same as the values at time (t), where n is the seasonality or cycle length.
- Noise (Random Component): a time-independent (non-systematic) component that is added to (or subtracted from) the data points.
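The three components can be illustrated with a small sketch in R. The series below is synthetic (the trend slope, cycle amplitude, and noise level are invented for illustration), and decompose() from the base stats package is just one possible way to estimate the components.

```r
# Synthetic monthly series: linear trend + 12-month cycle + random noise.
set.seed(42)                                       # reproducible noise
n <- 48                                            # four years of monthly points
month_idx <- seq_len(n)
trend    <- 100 + 2 * month_idx                    # depends on time itself
seasonal <- 10 * sin(2 * pi * month_idx / 12)      # repeats every 12 months
noise    <- rnorm(n, mean = 0, sd = 3)             # non-systematic component

y <- ts(trend + seasonal + noise, start = c(2013, 1), frequency = 12)

# decompose() estimates trend, seasonal and random parts (additive model).
parts <- decompose(y, type = "additive")

# The estimated seasonal pattern is identical from one year to the next.
print(round(parts$seasonal[1:12], 2))
```

parts$trend, parts$seasonal and parts$random hold the estimated components; plot(parts) draws all of them together.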
Based on this we can classify the time series into two classes:
- Non-Stationary: data points whose mean, variance, or covariance change over time. This shows up as trends, cycles, or a combination of them.
- Stationary: data points whose mean, variance, and covariance do not change over time.
The theory behind forecasting non-stationary signals is less mature, and modeling them is complex, which leads to inaccurate results. Luckily, a non-stationary series can often be transformed into a stationary one using common techniques (e.g. differencing). The idea of differencing is to subtract from each data value its predecessor, so the new series loses a component of its time dependence.
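As a minimal sketch of this transformation, the base R function diff() computes d[t] = y[t] - y[t-1]. The input values below are invented so that the effect is easy to see: a purely linear trend becomes a constant series after one round of differencing.

```r
# A series with a linear trend: its mean changes over time (non-stationary).
y <- ts(c(10, 13, 16, 19, 22, 25, 28, 31))

# First-order differencing: d[t] = y[t] - y[t-1].
d <- diff(y, lag = 1, differences = 1)

print(as.numeric(d))   # every difference equals the slope, 3
```

The differenced series is one point shorter than the original, because the first value has no predecessor.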
Sales-Force Forecasting
Salesforce.com (abbreviated as SF or SFDC) is a cloud computing company that provides customer relationship management (CRM) products. Salesforce.com's CRM service is broken down into several broad categories: Sales Cloud, Service Cloud, Data Cloud, Marketing Cloud, Community Cloud, Analytics Cloud, App Cloud, and IoT, and it has over 100,000 customers.
In this post we will analyze and forecast sample sales data from the Salesforce CRM that shows sales growth between 2013 and 2016, and we will predict the sales values for the next two business quarters. We will use two models for forecasting, ARIMA and HoltWinters, and will demonstrate how to do that using the R language.
The methodology that we will follow is:
- Aggregate the data per month
- Construct the model using 75% of the data as training set
- Check the model accuracy using 25% of the data, and calculate the root mean square error
- Forecast the data for the next two quarters
The fields of the data used here are:
- CloseDate: the date on which the opportunity was closed (Measure field)
- Amount: the monetary amount obtained
- IsWon and IsClosed: flags indicating whether the opportunity was won or lost, and whether it is closed or open
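The monthly aggregation step could look roughly like the sketch below. The opps data frame and all of its values are hypothetical stand-ins for the data retrieved from the CRM, and keeping only opportunities that are both won and closed is an assumption about how the flags are used.

```r
# Hypothetical opportunity records; the values are invented for illustration.
opps <- data.frame(
  CloseDate = as.Date(c("2013-01-05", "2013-01-20", "2013-02-11",
                        "2013-02-28", "2013-03-15", "2013-03-30")),
  Amount    = c(1000, 2500, 4000, 1500, 3000, 500),
  IsWon     = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  IsClosed  = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE)
)

# Assumption: only opportunities that are both closed and won count as sales.
won <- opps[opps$IsWon & opps$IsClosed, ]

# Sum Amount per calendar month of CloseDate.
won$Month <- format(won$CloseDate, "%Y-%m")
monthly <- aggregate(Amount ~ Month, data = won, FUN = sum)

# Monthly time series starting January 2013.
# This assumes every month has at least one record (no gaps).
sales <- ts(monthly$Amount, start = c(2013, 1), frequency = 12)
print(monthly)
```

If some months have no closed opportunities, they would need to be filled in with zeros before building the ts object, otherwise the months would be misaligned.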
Sales Forecasting using R
We need to install two packages: RJDBC, for connecting to the database and retrieving the data, and forecast, for data modeling, analysis, and forecasting. The following R script shows sales forecasting using ARIMA.
ARIMA stands for Autoregressive Integrated Moving Average, so it is a combination of multiple techniques:
- Auto-regression (AR)
- Integration (I)
- Moving Average (MA)
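In R, the three parts map onto the order = c(p, d, q) argument of arima() from the base stats package: p autoregressive lags, d rounds of differencing (the "integrated" part), and q moving-average lags. The sketch below fits an ARIMA(1, 1, 1) model to simulated data; the simulation coefficients and the chosen order are arbitrary values picked for illustration.

```r
set.seed(1)

# Simulate a series from an ARIMA(1, 1, 1) process (illustrative coefficients).
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.5, ma = 0.3), n = 200)

# order = c(p, d, q):
#   p = 1 autoregressive lag, d = 1 difference, q = 1 moving-average lag.
fit <- arima(y, order = c(1, 1, 1))

# One AR coefficient (ar1) and one MA coefficient (ma1) are estimated.
print(fit$coef)
```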
The script splits the dataset into training data (75%) and verification data (25%). Next, it builds the model based on the training data. R has an implementation of the ARIMA model that detects the parameters automatically. For the aforementioned SFDC dataset, the model obtained this way had a root mean square error of 233560. Next, we build a model using the whole dataset. Finally, we forecast the next 6 months using this model at Line 45.
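The workflow described here can be sketched with base R alone. This is not the original script: the sales series is synthetic, and the model order c(0, 1, 1) is fixed by hand as an assumption, whereas auto.arima() from the forecast package could be used to select the order automatically.

```r
set.seed(7)

# Synthetic monthly sales for 2013-2016 (invented values, not the SFDC data).
n <- 48
sales <- ts(1000 + 20 * seq_len(n) + rnorm(n, sd = 50),
            start = c(2013, 1), frequency = 12)

# 75% / 25% split: 36 training months, 12 verification months.
train <- window(sales, end = c(2015, 12))
test  <- window(sales, start = c(2016, 1))

# Fit on the training data (order chosen by hand for this sketch).
fit  <- arima(train, order = c(0, 1, 1))
pred <- predict(fit, n.ahead = length(test))$pred

# Root mean square error on the verification data.
rmse <- sqrt(mean((as.numeric(test) - as.numeric(pred))^2))
print(rmse)

# Refit on the whole dataset and forecast two quarters (6 months).
final_fit <- arima(sales, order = c(0, 1, 1))
fc <- predict(final_fit, n.ahead = 6)$pred
print(fc)
```

The verification error is computed only on months the model has not seen, which is why the final model is refit on all the data before forecasting.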
R supports different time series forecasting models. In the code above, you can easily change the forecasting model by changing Line 18, for example to use the HoltWinters model instead. In this case the root mean square error was 301992.9.
As we see from the two root mean square errors (233560 for ARIMA and 301992.9 for HoltWinters), ARIMA is the more appropriate model for our data, since it has the smaller error.
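A comparison of this kind can be sketched as follows, using HoltWinters() and arima() from the base stats package on the same training and verification split. The data is synthetic and the ARIMA order is fixed by hand, so the resulting errors say nothing about the SFDC dataset; the rmse() helper is defined here for illustration.

```r
set.seed(11)

# Synthetic monthly sales with trend and a yearly cycle (invented values).
n <- 48
month_idx <- seq_len(n)
sales <- ts(1000 + 20 * month_idx + 150 * sin(2 * pi * month_idx / 12) +
              rnorm(n, sd = 40),
            start = c(2013, 1), frequency = 12)

train <- window(sales, end = c(2015, 12))     # 75% of the months
test  <- window(sales, start = c(2016, 1))    # remaining 25%

# Helper defined for this sketch: root mean square error.
rmse <- function(actual, predicted) {
  sqrt(mean((as.numeric(actual) - as.numeric(predicted))^2))
}

# Holt-Winters exponential smoothing with trend and seasonal components.
hw_fit  <- HoltWinters(train)
hw_pred <- predict(hw_fit, n.ahead = length(test))

# ARIMA with a hand-picked order (an assumption, not a tuned model).
ar_fit  <- arima(train, order = c(0, 1, 1))
ar_pred <- predict(ar_fit, n.ahead = length(test))$pred

# The model with the smaller verification error is preferred.
errors <- c(HoltWinters = rmse(test, hw_pred), ARIMA = rmse(test, ar_pred))
print(errors)
print(names(which.min(errors)))
```

Which model wins depends on the data: a series with strong seasonality tends to favor a model that captures the seasonal pattern explicitly.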