Improve the Billing Assurance
I usually write my blog articles in German, but I guess this one will not be interesting for most of the people who normally follow my blog. I therefore decided to write it in English so that a wider audience can read it.
The second reason why the article is not typical for my blog is that I guess it will be a really long article. Normally my articles are shorter but as this is a huge area, I’ll need some more words to illustrate my idea.
I have been working for more than 11 years in the billing area of an international telco and internet carrier, and I have seen various areas of billing during that time. I started in the team responsible for leased line billing (back in the days the teams were separated by product), where I maintained the bill run. Later I moved to the project team of the billing department and, among other things, migrated various billing systems to one centralized EMEA (Europe, the Middle East and Africa) billing system. This system will now be used for the APAC (Asia-Pacific) region as well, as we also migrated the APAC systems to this one solution.
During my time in the operations team we always focused on billing accuracy, which means that we checked the invoices in every bill run. As I am currently looking at the operational tasks again (to write a tutorial and to be able to act as a backup), I can see that this is still the case, and I think there is major potential for improvement here.
Therefore I wrote this article here to summarize my ideas and maybe to also give others a chance to think again about their Billing Assurance Processes.
Maybe I also will have a chance to implement some of these improvements later as part of my Six Sigma Green Belt project. But we will see.
Of course it is essential for every company to ensure that the invoices it sends to its customers are correct and that no items are charged wrongly or are missing from an invoice. For our company this is especially important, as we have to be SOX (Sarbanes-Oxley Act) compliant. From my point of view the problem with SOX is that it does not describe specific rules. SOX was enacted as a reaction to a number of major corporate and accounting scandals. These scandals, which cost investors billions of dollars when the share prices of the affected companies collapsed, shook public confidence in the nation’s securities markets. So the main goal of SOX is to ensure that there is no fraud. But what exactly is the definition of „fraud“ in this context? I think it is difficult to explain, and each company will have its own definition.
But no matter what the definition is, they all will try to ensure correct invoices.
From a high-level view I would say that the controls mostly focus on three main areas:
1. system issues
The focus in this area is to ensure that the billing system works as expected. You may now ask yourself why this check is essential. That is easy to explain. Normally the billing system is not a static system, and there is a lot of development going on in it. New tariffs and services may be implemented, an identified bug may need to be fixed, or you may just have changed the look and feel of the invoice layout. All these things can have an effect on the existing system, and therefore it makes sense to check that the system still works as expected and that the updates have not introduced any side effects.
2. human errors
Within this area you try to identify potential human errors. Human errors are relatively difficult to find as they usually appear only once, for example typos. When you see a human error which appears again and again, this typically means that you have found a training issue. This should be addressed directly so that the training can be updated.
3. errors of usage-processing
With the usage-processing checks you try to verify that the provided and measured usage was charged correctly and that the measuring mechanism was correct. I’m talking here about a billing system for telco and internet carriers, which means that you have rated CDRs (Call Detail Records) and unrated EDRs (Event Detail Records). So you check that the rating of the CDRs was correct, that no CDRs are missing and that no CDRs are suspended. For the EDRs you check that none are suspended and that they are rated and charged correctly within the billing system.
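To make the "no CDRs missing, no CDRs suspended" check a bit more concrete, here is a minimal sketch in Python. It assumes that every record carries a unique ID and that the billing system can export the IDs of rated and suspended records; the function and field names are made up for illustration and are not taken from any real billing product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageCheckResult:
    missing: set      # received, but neither rated nor suspended
    suspended: set    # parked by the rating step, needs a manual look
    unexpected: set   # rated although never received (e.g. duplicates)

    @property
    def passed(self) -> bool:
        return not (self.missing or self.suspended or self.unexpected)


def reconcile_cdrs(received: set, rated: set, suspended: set) -> UsageCheckResult:
    """Compare the record IDs that were delivered with the billing output."""
    accounted_for = rated | suspended
    return UsageCheckResult(
        missing=received - accounted_for,
        suspended=set(suspended),
        unexpected=rated - received,
    )


# Hypothetical sample data: four records delivered, two rated, one suspended.
result = reconcile_cdrs(
    received={"cdr-1", "cdr-2", "cdr-3", "cdr-4"},
    rated={"cdr-1", "cdr-2"},
    suspended={"cdr-3"},
)
print("missing:", result.missing)
print("suspended:", result.suspended)
print("check passed:", result.passed)
```

The same comparison would work for EDRs; only the source of the ID sets changes.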
Currently I would say that most companies focus on items 1 and 3. The reason could be that these two areas normally have the biggest impact. If the system is not working correctly, potentially all customers are affected, and if the usage is not processed correctly, all customers with the same usage type may be affected. The remaining area, human errors, usually affects only specific customers. Furthermore the system and usage checks are usually the most time-consuming ones, and therefore you normally have little time left to search for human errors.
My idea to improve the billing assurance focuses on the first topic, the system checks.
Check the system
As I explained, it is not unusual to have a lot of updates/changes/implementations on a billing platform. There are various reasons for this, and from my point of view the first problems appear during software development and software deployment. I would say that in an ideal world the software deployment should go through the following steps:
1. System Test
Within the system test you normally check that the system is still working, which means that the update/change has no side effects. Furthermore you check that the update/change produces an output at all, independent of the content of that output. You normally do not focus on the results here, as these are checked at a later stage.
2. UAT
UAT stands for User Acceptance Test, which means that the user checks the results of the update. To do an accurate UAT you need two important things: first the specifications which were used to develop the update/change, and second some meaningful test data. When both have been defined/provided, you can process the test data in the UAT system and provide the results to the stakeholders. Now you can compare the results with the specifications which you provided for the development. The focus here is only on the update/change.
3. Dry Run
Within the Dry Run you should normally check two items in parallel. First you check the same data which you checked in the UAT, and second you also check the effect on existing customers. The focus here is different from the UAT, as you also try to identify side effects on existing customers. It is important that the Dry Run is done in the production system (of course you should define save points and make a backup before implementing the changes/updates).
4. Production
Ideally this should be the easiest and fastest check within the deployment stages, as you normally only have to check that the package was implemented correctly in production. This check can be done based on spot checks.
Of course, to follow this there are some important prerequisites which need to be fulfilled:
a.) different test systems
For each stage you need a separate test instance, and ideally these instances are installed on completely separate systems (servers). It is also possible to run the different stages on the same server, but then you need to isolate the test systems from the production environment.
b.) equal environments
As you need various instances/systems for the test stages, it is absolutely essential that all instances/systems have the same base, which means that you can only test one iteration of each update/change in the systems at a time. It makes no sense to implement the next update/change in a stage as soon as that stage has been passed, before the previous update/change has passed all test stages. All reference data must be the same in all systems.
c.) clearly defined specifications
For the development and also for the later testing it is essential that all requirements are specified clearly and in detail. There is a risk that a developer „interprets“ the requirements when they are not clearly defined. These interpretations can lead to a wrong implementation.
d.) clearly defined exit and entry criteria
For each stage the entry and exit criteria need to be defined. If the criteria are not met, it is important that you start the whole process again from the beginning. It is not allowed to implement a fix for an identified bug in only one instance. When a bug is found which requires a change, the process needs to be restarted from the first stage.
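The rule "any failed stage sends the change back to the first stage" can be sketched as a small loop. This is only an illustration of the four stages described above; the callback `stage_passes` and the attempt limit are assumptions of mine and stand in for whatever exit criteria you define.

```python
STAGES = ["System Test", "UAT", "Dry Run", "Production"]


def run_deployment(stage_passes, max_attempts=3):
    """Walk a change through all stages; a failure restarts from stage one.

    stage_passes(stage, attempt) is a hypothetical callback that evaluates
    the exit criteria of a stage and returns True or False.
    Returns (released, history), where history lists every executed stage.
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        for stage in STAGES:
            passed = stage_passes(stage, attempt)
            history.append((attempt, stage, passed))
            if not passed:
                break  # fix the bug, then start again with the first stage
        else:
            return True, history  # every stage passed within one attempt
    return False, history


# Made-up scenario: the first attempt fails in the UAT, the second passes.
def exit_criteria(stage, attempt):
    return not (stage == "UAT" and attempt == 1)


released, history = run_deployment(exit_criteria)
for attempt, stage, passed in history:
    print(attempt, stage, "passed" if passed else "FAILED")
print("released:", released)
```

Note that after the failed UAT the second attempt runs the System Test again instead of resuming at the UAT, which is exactly the point of criterion d.).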
Unfortunately my experience from the past shows that the time for tests is usually limited. More or less the development process will look like this:
In a „standard development cycle“ the start and the end of the project are usually defined. Mostly these start and end dates are fixed and can only be postponed with really good reasons. IMHO this is one of the issues which we have in a project management environment, but that is a completely different story. The only problem which concerns us here is that „testing“ is one of the last steps within the development cycle, and this is mostly the step whose duration we have to cut because the previous steps took longer than planned. As a result we can only do a quick check which gives us a good feeling about the update/change, but we cannot do full testing within the given time frame.
Reasons for issues
When we now look at the standard development cycle and keep in mind that the time for tests is usually limited, we can understand where the issues are coming from. But there are some more reasons, which I try to list here at a really high level:
-> limited test time
Within the limited time slot it was not possible to test all possible scenarios. On the one hand it could be that not all potential cases were defined, and on the other hand it could be that not all results could be reviewed properly.
-> implementation without test
In some cases urgent fixes are identified which are implemented in the system without a full test (quick fixing). Mostly these quick fixes are only checked regarding their results (which means that it is only checked whether the fix really fixes the identified issue). These quick fixes can cause side effects which are not found during the check, as the focus was only on the result.
-> old issues
In some rare cases an issue may have existed in the system for a long time without ever being identified. This can be the case when the issue only appears in a really rare scenario.
All these potential reasons make it rational to check the system during each standard production bill run to ensure the highest quality.
How do you check
Of course there are various ways to check a system, and I’m not sure whether other companies handle it differently than we do, but in my experience the following mechanism has been used in the past and was rated as really accurate.
- You define a test for each scenario
- You look at the invoice and check that everything is fine
- You look at the output files (e.g. reports) and check that everything is fine
To give you an idea of potential items which should be checked, have a look at the following:
|invoice reference(s)||Is the mechanism to display references on the invoice working correctly and are the references shown in the correct place?|
|invoice footer||If you have more than one OpCo (Operating Company) implemented in your system, you should check that the correct invoice footers are used for all OpCos (correct legal entity etc.)|
|bank details on invoice||As with the invoice footer, you need to verify for each OpCo that the correct bank details are shown on the invoices|
|exchange rate(s)||If your billing system can handle more than one currency, you need to validate the exchange rates and the correct calculation|
|VAT rate(s)||Each OpCo will have its own VAT rate, and therefore it should be checked that the billing system is using the correct rates and that the calculation is correct|
|Minimum-Invoice-Threshold||In some cases it is possible to define a minimum threshold for the invoice amount in the billing system. When the invoice amount is below this threshold, the billing system should suppress the invoice. Here you need to check that only invoices below the defined threshold are suppressed and that an invoice is generated in a following month when the old/suppressed amount plus the new amount is higher than the threshold|
|On-Hold mechanism||If your billing system can handle on-hold accounts (an account is excluded from the current bill run), you need to check that this mechanism is working correctly|
|Minimum-Spend mechanism||If your billing system supports a minimum spend commitment, you need to check that this is working correctly. You should check one customer who is below the minimum spend, one customer who is above the minimum spend and one customer who has exactly the minimum spend|
|VAT exemption||If a customer is VAT exempt, a special VAT exemption sentence needs to be shown on the invoice and the VAT should not be charged/calculated|
|dispatch option(s)||If your billing system supports different invoice dispatch options, each option needs to be checked|
|invoice language(s)||If your billing system supports different invoice languages, each language needs to be checked regarding the invoice layout|
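To show what a single test scenario from this list could look like, here is a minimal sketch of the Minimum-Invoice-Threshold check in Python. It is not how any particular billing system implements it. In particular I assume that an invoice is suppressed only while the total is strictly below the threshold and that a total exactly at the threshold is invoiced; the function name and the amounts are made up.

```python
from decimal import Decimal


def bill_month(new_charges, carried_over, threshold):
    """Return (invoiced_amount, carry_over_for_next_month).

    Assumption: the invoice is suppressed only while the total is strictly
    below the threshold; a total equal to the threshold is invoiced.
    """
    total = carried_over + new_charges
    if total < threshold:
        return None, total        # suppressed, the amount rolls forward
    return total, Decimal("0")    # invoiced, nothing left to carry over


# Made-up scenario: four months of small charges against a threshold of 10.00.
threshold = Decimal("10.00")
carry = Decimal("0")
invoiced_amounts = []
for charges in [Decimal("4.00"), Decimal("3.50"), Decimal("2.50"), Decimal("1.00")]:
    invoiced, carry = bill_month(charges, carry, threshold)
    invoiced_amounts.append(invoiced)

print(invoiced_amounts)  # months 1, 2 and 4 are suppressed, month 3 is invoiced
print(carry)             # the amount waiting for a later invoice
```

The scenario deliberately hits the boundary in month 3, because the value exactly at the threshold is the case where a specification is most likely to be „interpreted“.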
Of course these are only a handful of the checks which need to be done, but they give you an idea of how complicated and how comprehensive these checks are. I will give you a short example of the complexity:
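For example, we have a certain number of values per dimension [the original table listing them is not recoverable; only the column header „Curr.“ survives].
This would mean, for each check, a „sum of invoice“ that is the product of these numbers [only one row of this table survives]:
|bank details on invoice||1||5||3||3||4||180|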
As you can see, in the ideal case you have to check 3,060 invoices! Of course you can combine some checks, for example the check of the invoice footer and the check of the bank details. But it could also be that you would like to split the workload between various teams, and then all 3,060 invoices may be required. If we now assume that each check needs at least 5 minutes, all the checks together require 15,300 minutes = 255 hours.
And this number is multiplied again whenever you implement a new OpCo or an additional currency.
This is really a huge expenditure of time and people!
It should also be considered that these are only some of the potential checks and that we did not focus directly on correct usage billing. Normally we have some more multipliers here (e.g. products, service types, usage types etc.)
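The arithmetic behind these numbers is easy to reproduce. The short Python sketch below uses the five factors from the „bank details on invoice“ row (1, 5, 3, 3, 4); which factor belongs to which dimension is not visible any more, so I deliberately do not name them. The total of 3,060 invoices and the 5 minutes per check are taken from the text above.

```python
from math import prod

# Factors from the "bank details on invoice" row; dimension names unknown.
factors = [1, 5, 3, 3, 4]
invoices_for_check = prod(factors)
print(invoices_for_check)            # 180

# Effort for all checks, using the totals stated in the article.
total_invoices = 3060
minutes_per_invoice = 5
total_minutes = total_invoices * minutes_per_invoice
total_hours = total_minutes / 60
print(total_minutes, total_hours)    # 15300 255.0

# Adding one more value to a single dimension (here: the second factor
# goes from 5 to 6) scales this check by 6/5.
grown = prod([1, 6, 3, 3, 4])
print(grown)                         # 216
```

So one additional value in one dimension already adds 36 invoices for this single check, and the same happens for every other check that depends on that dimension.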
The Improvement Idea
Normally the checks are done manually, which means that the person doing the check picks up an invoice and checks it against the requirements. From my point of view this task, especially the check of the system accuracy, can be automated. The only thing you need to do is to define some sample patterns. For each scenario which you would like to check, you define an example, and this example is processed by the billing system. The output (invoice, CSV files or whatever) can then be checked manually once. When the check has been passed successfully, you save the pattern and also the results as a master lookup. Now you can process this master in the billing system again and again, and the results should always be the same. The comparison of the master output and the new output can then be done automatically, e.g. you can simply compare the MD5 checksums of the files.
So the process should look like this:
When your check patterns have been defined and validated, you can pass them through the billing system each time, and the results should always be the same. This is a simple mechanism to test that nothing has been changed in the system which causes any problems. The test patterns can of course be used in any instance (System Test, UAT, Dry Run, Production), but at the very least they should be used in the production system before each production bill run. The test patterns should be defined not only for customers/accounts but also for various usage and usage types, so that each scenario is covered.
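Saving the validated results as a master could be as simple as recording one checksum per output file. The following Python sketch shows one possible way, using only the standard library. It assumes that the billing output is written as files into a directory and that the output is reproducible byte for byte (no run dates or running invoice numbers inside the files); the file names and contents in the demo are made up.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_master(output_dir: Path) -> dict:
    """Map every output file (relative path) to its checksum."""
    return {
        path.relative_to(output_dir).as_posix(): md5_of_file(path)
        for path in sorted(output_dir.rglob("*"))
        if path.is_file()
    }


# Demo with made-up files standing in for manually validated billing output.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "reports").mkdir()
    (out / "invoice_0001.txt").write_bytes(b"Total due: 100.00 EUR\n")
    (out / "reports" / "summary.csv").write_bytes(b"account;amount\n0001;100.00\n")
    master = build_master(out)

for name, checksum in master.items():
    print(name, checksum)
```

If the output contains values which change with every run, these would have to be fixed for the test patterns or removed before hashing; otherwise the checksums can never match.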
So the automated comparison could look like this:
The comparison of the MD5 checksums should of course be done automatically, and when the results show that there are no differences, this means that the system is verified and the real production data can be processed.
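A minimal sketch of such a comparison in Python: it takes two mappings of file name to MD5 checksum, one from the validated master and one from the current run, and only reports the system as verified when they are identical. The function names and the sample data are my own assumptions for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """MD5 hex digest of some output content."""
    return hashlib.md5(data).hexdigest()


def compare_with_master(master: dict, current: dict) -> dict:
    """List files whose checksum changed, disappeared or newly appeared."""
    return {
        "changed": sorted(
            name for name in master.keys() & current.keys()
            if master[name] != current[name]
        ),
        "missing": sorted(master.keys() - current.keys()),
        "unexpected": sorted(current.keys() - master.keys()),
    }


def system_verified(diff: dict) -> bool:
    """True only if there is no difference at all."""
    return not any(diff.values())


# Made-up example: the report differs by one cent after a system change.
master = {
    "invoice_0001.txt": checksum(b"Total due: 100.00 EUR\n"),
    "report.csv": checksum(b"account;amount\n0001;100.00\n"),
}
current = {
    "invoice_0001.txt": checksum(b"Total due: 100.00 EUR\n"),
    "report.csv": checksum(b"account;amount\n0001;100.01\n"),
}

diff = compare_with_master(master, current)
print(diff)
print("start production bill run:", system_verified(diff))
```

A checksum only tells you that something differs, not what. For the files listed under "changed" you would still open the master and the new output side by side.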
The time saving is one of the benefits here, and maybe you think it is the biggest one, but there are some additional benefits which you should also have a look at:
● elimination of human errors during the checks
As the check process will be an automated process, you can completely eliminate human errors in the check area. Whenever a person is doing a check manually there is a potential risk that the person will overlook an issue.
● time-saving leads to more quality
As you completely save the time which you formerly spent on the system checks, you can invest this time in other checks. So the quality of your checks will increase significantly.
● easy regression testing
Whenever you have implemented new updates/changes/patches, you can easily verify that there are no side effects on the existing system. You simply process the example patterns and then you quickly know whether there are any side effects.
It may not be an easy exercise to define the sample patterns and to set them up in the system, but the benefit will outweigh the effort. You can save a lot of time, and you can certainly improve the quality of your checks. To follow my idea you need some talented people to analyse and define the test patterns, but these should normally be available in a big company.
Another topic which should maybe be reviewed after implementing the improvement is the point at which the checks are placed. The question here is whether the invoice creation is the right place to identify human errors made during data entry in the system. I would recommend doing this at a really early stage within the order flow, to ensure that typos are found as soon as possible and not at the end of the order flow.
All this is only a simple idea which has been growing in my mind over the months, and I will see if I get the chance to implement it later as part of a project.