CallMeMaybe¶

Project Description¶

The virtual telephony service CallMeMaybe (call center) is developing a new function that will give supervisors information on the least effective operators. An operator is considered ineffective if they have a large number of missed incoming calls (internal and external) and a long waiting time for incoming calls. Moreover, if an operator is supposed to make outgoing calls, a small number of them is also a sign of ineffectiveness.

The datasets contain data on the use of the virtual telephony service CallMeMaybe. Its clients are organizations that need to distribute large numbers of incoming calls among various operators or make outgoing calls through their operators. Operators can also make internal calls to communicate with one another. These calls go through CallMeMaybe's network.

Executive summary¶

Link to presentation: https://drive.google.com/file/d/1EXTnZYiKVr9UPSqS_IO8U_-A0boTII16/view?usp=sharing

Link to dashboard: https://public.tableau.com/profile/alina7324#!/vizhome/CallMeMaybe_16106288661390/CallMeMaybedashboard

Task: Determine the thresholds for effective operator performance according to KPIs

Recommendations:

Effective operators:
- Answer calls in 25 seconds
- Make more than 34 outgoing calls each day
- Get 25 answered calls each day
Ineffective operators:
- Take longer than 25 seconds to answer an incoming call
- Make less than 34 outgoing calls per day
- Get less than 25 answered calls per day
Check missed calls (abandonment rate) on the company level and not on operator level.
Each operator should only make one type of call (incoming or outgoing). This will decrease the call abandonment rate.
Put emphasis on outgoing calls.
Put emphasis on improving the operators working with clients with tariff plans B and C.

Tables Of Contents¶

Opening data
Preprocessing
- Calls datset
- Clients datase
Exploratory data analysis
Determine effectiveness using clustering
- Incoming calls
- Outgoing calls
Testing statistical hypothesis
- Difference between missed calls rate for incoming calls
- Do tariff plans effect quality of service for outgoing calls
Conclusion
Resources

Opening data ¶

import pandas as pd 
import numpy as np
from numpy import median
from scipy import stats as st
import math as mth

import datetime as dt
from datetime import datetime

import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.gridspec as gs

import seaborn as sns

import plotly
import plotly.express as px
from plotly import graph_objects as go
import plotly.io as pio
pio.templates.default = "seaborn"

from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import *

import sys
import warnings
if not sys.warnoptions:    
       warnings.simplefilter("ignore")

#import the files

location = '/some/local/path'

try: 
    calls = pd.read_csv(location + 'telecom_dataset_us.csv')
    clients = pd.read_csv(location + 'telecom_clients_us.csv')
except: 
    calls = pd.read_csv('/datasets/telecom_dataset_us.csv')
    clients = pd.read_csv('/datasets/telecom_clients_us.csv')

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
0	166377	2019-08-04 00:00:00+03:00	in	False	NaN	True	2	0	4
1	166377	2019-08-05 00:00:00+03:00	out	True	880022.0	True	3	0	5
2	166377	2019-08-05 00:00:00+03:00	out	True	880020.0	True	1	0	1
3	166377	2019-08-05 00:00:00+03:00	out	True	880020.0	False	1	10	18
4	166377	2019-08-05 00:00:00+03:00	out	False	880022.0	True	3	0	25

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
53897	168606	2019-11-10 00:00:00+03:00	out	True	957922.0	True	1	0	38
53898	168606	2019-11-11 00:00:00+03:00	out	True	957922.0	False	2	479	501
53899	168606	2019-11-15 00:00:00+03:00	out	True	957922.0	False	4	3130	3190
53900	168606	2019-11-15 00:00:00+03:00	out	True	957922.0	False	4	3130	3190
53901	168606	2019-11-19 00:00:00+03:00	in	False	NaN	True	2	0	64

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	total_call_duration
35215	167521	2019-11-24 00:00:00+03:00	in	False	NaN	True	18	471
7782	166636	2019-10-15 00:00:00+03:00	in	False	NaN	True	1	38
31004	167359	2019-11-23 00:00:00+03:00	in	False	NaN	True	1	56
35673	167532	2019-11-28 00:00:00+03:00	in	False	NaN	True	6	83
43908	168021	2019-10-20 00:00:00+03:00	in	False	NaN	True	2	0
13874	166803	2019-11-21 00:00:00+03:00	in	False	NaN	True	1	25
33015	167479	2019-10-23 00:00:00+03:00	in	False	NaN	True	1	6
26920	167158	2019-11-22 00:00:00+03:00	in	False	NaN	True	7	15
16028	166908	2019-08-27 00:00:00+03:00	in	False	NaN	True	2	30
41477	167870	2019-11-16 00:00:00+03:00	in	False	NaN	True	1	86

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
1663	166406	2019-09-02 00:00:00+03:00	in	NaN	879898.0	False	1	2	9
5302	166541	2019-09-26 00:00:00+03:00	in	NaN	908960.0	False	1	393	423
5306	166541	2019-09-26 00:00:00+03:00	in	NaN	908958.0	False	2	547	612
6359	166604	2019-11-01 00:00:00+03:00	in	NaN	893402.0	False	1	94	117
7339	166658	2019-09-24 00:00:00+03:00	in	NaN	890404.0	False	1	150	157
7748	166658	2019-10-15 00:00:00+03:00	in	NaN	890404.0	False	1	51	57
13415	166916	2019-10-01 00:00:00+03:00	in	NaN	906396.0	False	1	100	117
13531	166916	2019-10-07 00:00:00+03:00	in	NaN	906406.0	False	3	378	461
13781	166916	2019-10-23 00:00:00+03:00	in	NaN	906400.0	False	1	81	110
15558	166983	2019-09-02 00:00:00+03:00	in	NaN	901880.0	False	1	119	127

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
0	166377	2019-08-05	out	True	880022	True	3	0	5
1	166377	2019-08-05	out	True	880020	True	1	0	1
2	166377	2019-08-05	out	True	880020	False	1	10	18
3	166377	2019-08-05	out	False	880022	True	3	0	25
4	166377	2019-08-05	out	False	880020	False	2	3	29

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
17639	167016	2019-11-06	in	False	916424	False	27	2306	2557
1002	166405	2019-10-14	out	False	882684	False	1	46	61
18228	167035	2019-11-18	in	False	923526	False	33	2152	2398
6177	166582	2019-10-29	in	False	885890	False	17	613	778
32105	167650	2019-10-01	out	False	921318	True	18	0	497
26620	167445	2019-11-10	out	False	920726	False	2	156	170
15715	166983	2019-09-18	in	False	901884	False	7	1477	1589
510	166377	2019-11-25	in	False	880028	False	1	89	105
42040	168187	2019-11-26	out	False	937966	True	8	0	100
21113	167125	2019-10-01	out	True	902778	True	2	0	8
13372	166916	2019-09-27	in	False	906404	False	26	2976	3986
43921	168336	2019-11-21	out	False	948758	True	15	0	318
25221	167285	2019-10-11	out	True	908640	False	1	284	287
37340	168047	2019-10-30	out	False	937604	False	1	4	4
24495	167199	2019-11-17	in	False	911310	False	1	18	24
8794	166678	2019-08-31	in	False	888868	False	1	34	42
39857	168187	2019-10-18	out	False	937808	True	9	0	288
43347	168253	2019-11-18	out	True	952948	True	2	0	0
22697	167172	2019-10-29	out	False	926490	False	1	756	768
32898	167654	2019-11-13	out	False	918986	True	12	0	357

	user_id	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
5780	166582	2019-09-11	in	False	885890	True	1	1	27
36344	167977	2019-11-19	in	False	944226	True	2	66	170
42187	168187	2019-11-28	in	False	937812	True	1	61	69
5231	166541	2019-09-13	in	False	908958	True	1	1	21
27179	167466	2019-10-21	in	False	921818	True	1	1	6

	count	mean	std	min	25%	50%	75%	max
user_id	41546.0	167301.311992	600.418838	166377.0	166782.0	167175.0	167827.0	168606.0
operator_id	41546.0	916523.315409	21230.041008	879896.0	900790.5	913938.0	937708.0	973286.0
calls_count	41546.0	16.900424	59.749373	1.0	1.0	4.0	13.0	4817.0
call_duration	41546.0	1009.769172	4064.106117	0.0	0.0	106.0	770.0	144395.0
total_call_duration	41546.0	1321.592813	4785.978633	0.0	67.0	288.0	1104.0	166155.0

	count	unique	top	freq
direction	41546	2	out	28813
internal	41546	2	False	36161
is_missed_call	41546	2	False	27732

	user_id	tariff_plan	date_start
0	166713	A	2019-08-15
1	166901	A	2019-08-23
2	168527	A	2019-10-29
3	167097	A	2019-09-01
4	168193	A	2019-10-16

	user_id	tariff_plan	date_start
727	166554	B	2019-08-08
728	166911	B	2019-08-23
729	167012	B	2019-08-28
730	166867	B	2019-08-22
731	166565	B	2019-08-08

	operator_id	user_id
0	879896	1
1	937966	1
2	938072	1
3	938070	1
4	938022	1

	user_id	operator_id
0	168187	50
1	167626	48
2	167497	30
3	168252	28
4	168361	27
5	168062	27
6	166680	21
7	166520	18
8	166658	17
9	166916	16
10	168091	16
11	167176	15
12	167188	15
13	167580	15
14	168336	14
15	167521	12
16	168073	11
17	167445	11
18	166405	10
19	167828	9

	date	operator_id	total_call_duration
0	2019-09-25	885876	189989
1	2019-09-26	885876	172165
2	2019-09-09	885876	160826
3	2019-09-18	885876	159178
4	2019-10-02	885876	155402

	user_id	join_date	tariff_plan	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration
0	166377	2019-08-01	B	2019-08-05	out	True	880022	True	3	0	5
1	166377	2019-08-01	B	2019-08-05	out	True	880020	True	1	0	1
2	166377	2019-08-01	B	2019-08-05	out	True	880020	False	1	10	18
3	166377	2019-08-01	B	2019-08-05	out	False	880022	True	3	0	25
4	166377	2019-08-01	B	2019-08-05	out	False	880020	False	2	3	29

	index	user_id	join_date	tariff_plan	date	direction	internal	operator_id	is_missed_call	calls_count	call_duration	total_call_duration	waiting_duration	avg_wait_time
21	21	166377	2019-08-01	B	2019-08-12	in	False	880028	False	1	407	411	4	4.0
26	26	166377	2019-08-01	B	2019-08-13	in	False	880028	False	1	88	102	14	14.0
29	29	166377	2019-08-01	B	2019-08-14	in	False	880026	False	2	197	218	21	10.5
30	30	166377	2019-08-01	B	2019-08-14	in	False	880028	False	1	33	37	4	4.0
38	38	166377	2019-08-01	B	2019-08-15	in	False	880028	False	1	23	27	4	4.0

	avg_miss_call	wait_time
operator_id
879896	0.0	13.396199
879898	0.0	14.080117
880020	0.0	7.714286
880022	0.0	14.000000
880026	0.0	5.935185

	avg_call_count	avg_answer_call
operator_id
879896	9.083333	6.468750
879898	42.405882	28.523529
880020	2.923077	1.230769
880022	3.048387	1.354839
880026	13.463415	9.512195

CallMeMaybe¶

Project Description¶

Executive summary¶

Tables Of Contents¶

Opening data ¶

Preprocessing ¶

Calls ¶

Clients ¶

Final touches:¶

Preprocessing stage summary:¶

Exploratory data analysis ¶

Find how many users and operators are in the log ¶

Invastigate the proportions of incoming and outgoing calls ¶

Invastigate the proportions of internal and external calls ¶

Investigate the distribution of entries over time and determaine whether all the data should be used for analysis ¶

Find Outliers for call duration ¶

Calculate waiting time and average waiting time for each entry ¶

Determine effectiveness using clustering ¶

Incoming calls ¶

Outgoing calls ¶

Testing statistical hypothesis ¶

Difference between missed calls rate for incoming calls ¶

Do tariff plans effect quality of service for outgoing calls ¶

Conclusions ¶

Summary:¶

Conclusions:¶

Recommendations:¶

Resources ¶