python algorithm barcode check-digit

Create check digit function


I'm trying to create check digits and append them after the original UPCs. Here's the sample data

Because there are leading 0's, I have to read the data as strings first:

import pandas as pd                                                                 
upc = pd.read_csv("/Users/lee/Desktop/upc.csv", dtype = str)

Here's an example of the check digit algorithm:
If upc is 003459409000
step (1) 0 + 3*0 + 3 + 3*4 + 5 + 3*9 + 4 + 3*0 + 9 + 3*0 + 0 + 3*0 = 60
step (2) 60 mod 10 = 0
step (3) check digit = 0 (if it's not 0, then check digit = 10 - number in step 2)

Based on the algorithm, here's the code:

def add_check_digit(upc_str):  
    upc_str = str(upc_str)
    if len(upc_str) != 12: 
        raise Exception("Invalid length")

    odd_sum = 0
    even_sum = 0 
    for i, char in enumerate(upc_str): 
        j = i+1 
        if j % 2 == 0: 
            even_sum += int(char) 
        else:
            odd_sum += int(char) 
    total_sum = (even_sum * 3) + odd_sum 
    mod = total_sum % 10 
    check_digit = 10 - mod 
    if check_digit == 10: 
        check_digit = 0 
    return upc_str + str(check_digit) 

If I run this code, it gives the correct check digit and appends it to the end of the original UPC. For the example above, if I type:

add_check_digit('003459409000')

The output is the 13-digit UPC 0034594090000.

Now my questions are:

  1. This function works only for a single UPC, i.e., I have to copy and paste each UPC to get its check digit. How do I make it work for a whole list of UPCs in a dataframe? Each result should be a 13-digit UPC with the check digit appended to the original UPC.

  2. The UPCs are read as strings. How do I apply the function to the UPCs? I suppose I should convert the strings to numbers somehow.

  3. After I get the new UPCs, how do I save the result in a csv file?


Solution

  • I don't have your CSV file, so I build a small sample DataFrame below. With the real file, this step is the same as yours:

    df = pd.read_csv("/Users/lee/Desktop/upc.csv", dtype = str)
    

    Sample data setup and the check digit function:

    import pandas as pd
    df=pd.DataFrame({"upc_in_file":['003459409000','003459409001','003459409002']})
    
    def add_check_digit(upc_str):  
        upc_str = str(upc_str)
        if len(upc_str) != 12: 
            raise Exception("Invalid length")
    
        odd_sum = 0
        even_sum = 0 
        for i, char in enumerate(upc_str): 
            j = i+1  # 1-based position of the digit
            if j % 2 == 0: 
                even_sum += int(char) 
            else:
                odd_sum += int(char) 
        # Compute the total only after every digit has been summed
        total_sum = (even_sum * 3) + odd_sum 
        mod = total_sum % 10 
        check_digit = 10 - mod 
        if check_digit == 10: 
            check_digit = 0 
        return upc_str + str(check_digit) 
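
For what it's worth, the same arithmetic can be written more compactly. This is only a sketch: the name add_check_digit_compact is made up here, and it assumes the weighting from the question (even positions times 3, odd positions times 1).

```python
def add_check_digit_compact(upc_str):
    """Append a check digit to a 12-digit UPC string (hypothetical helper)."""
    upc_str = str(upc_str)
    if len(upc_str) != 12 or not upc_str.isdecimal():
        raise ValueError("Expected exactly 12 digits")

    digits = [int(char) for char in upc_str]
    odd_sum = sum(digits[0::2])   # positions 1, 3, 5, ... (1-based)
    even_sum = sum(digits[1::2])  # positions 2, 4, 6, ... (1-based)
    total_sum = odd_sum + 3 * even_sum

    # The outer "% 10" maps a result of 10 back to 0
    check_digit = (10 - total_sum % 10) % 10
    return upc_str + str(check_digit)


print(add_check_digit_compact("003459409000"))  # 0034594090000
```

The slices replace the explicit loop, and the double modulo replaces the separate "if check_digit == 10" step.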
    

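    Apply the function to the UPC column (the one read from the file). The values can stay as strings: the function converts each character with int(), so there is no need to convert the column to numbers first.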

    df['new_upc']=df['upc_in_file'].apply(add_check_digit)
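
One thing to watch: apply stops with an exception at the first UPC that isn't 12 characters long. If the real file might contain such rows, a sketch like the following computes check digits only for the valid rows and leaves the rest missing. The helper name with_check_digit and the "12345" row are invented for the illustration.

```python
import pandas as pd


def with_check_digit(upc):
    # Hypothetical helper: weights 1, 3, 1, 3, ... as in the question
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(upc))
    return upc + str((10 - total % 10) % 10)


df = pd.DataFrame({"upc_in_file": ["003459409000", "003459409001", "12345"]})

# True only for rows that are exactly 12 digits; missing values count as invalid
valid = df["upc_in_file"].str.fullmatch(r"\d{12}", na=False)

# Fill new_upc for the valid rows only; the other rows stay missing (NaN)
df.loc[valid, "new_upc"] = df.loc[valid, "upc_in_file"].apply(with_check_digit)
print(df)
```

Whether to skip invalid rows or fail loudly depends on the data; failing loudly, as the original function does, is the safer default if every row is supposed to be a 12-digit UPC.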
    

    Now save the result to a CSV file:

    df.to_csv("my_updated_upc.csv")
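
By default to_csv also writes the DataFrame index as an extra first column; passing index=False leaves it out. When the file is read back later, dtype=str is needed again, otherwise the leading zeros are lost. A small round-trip sketch, using an in-memory buffer instead of a real file purely for illustration:

```python
import io

import pandas as pd

df = pd.DataFrame({
    "upc_in_file": ["003459409000"],
    "new_upc": ["0034594090000"],
})

buffer = io.StringIO()          # stands in for a file path in this sketch
df.to_csv(buffer, index=False)  # index=False: no extra index column

buffer.seek(0)
reloaded = pd.read_csv(buffer, dtype=str)  # keep UPCs as strings
print(reloaded)
```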
    

    The saved file contains both the original upc_in_file column and the new_upc column.