opencv machine-learning ocr text-extraction cascade-classifier

Is there a way to use OCR to extract specific data from a CAD technical drawing?


I'm trying to use OCR to extract only the base dimensions of a CAD model, but the drawings also contain associative dimensions that I don't need (angles, the distance from a baseline to a hole, etc.). Here is an example of a technical drawing. (The numbers in red circles are the base dimensions; the ones highlighted in purple should be ignored.) How can I tell my program to extract only the base dimensions, that is, the height, length, and width of the block before it goes through the CNC?

The issue is that the drawings I get don't follow a specific format, so I can't tell the OCR where the dimensions are. It has to figure that out on its own from context.

Should I train the program through machine learning by running several iterations and correcting it? If so, what methods are there? The only thing I can think of is OpenCV cascade classifiers. Or are there other methods for solving this problem? Sorry for the long post. Thanks.


Solution

  • I feel you... It's a very tricky problem, and we spent the last 3 years finding a solution for it. Forgive me for mentioning our own solution, but it will certainly solve your problem. Install it with pip install werk24, then run:

    
    from werk24 import Hook, W24AskVariantMeasures
    from werk24.models.techread import W24TechreadMessage
    from werk24.utils import w24_read_sync
        
    from . import get_drawing_bytes # define your own
        
        
    def recv_measures(message: W24TechreadMessage) -> None:
        # fall back to an empty list if the payload carries no measures
        for cur_measure in message.payload_dict.get('measures') or []:
            print(cur_measure)
        
    if __name__ == "__main__":
        # define what information you want to receive from the API
        # and what shall be done when the info is available.
        hooks = [Hook(ask=W24AskVariantMeasures(), function=recv_measures)]
        
        # submit the request to the Werk24 API
        w24_read_sync(get_drawing_bytes(), hooks)
    
    

    For your example drawing, it returns measures such as the following:

        {
            "position": <STRIPPED>,
            "label": {
                "blurb": "ø30 H7 +0.0210/0",
                "quantity": 1,
                "size": {
                    "blurb": "30",
                    "size_type": "DIAMETER",
                    "nominal_size": "30.0"
                },
                "unit": "MILLIMETER",
                "size_tolerance": {
                    "toleration_type": "FIT_SIZE_ISO",
                    "blurb": "H7",
                    "deviation_lower": "0.0",
                    "deviation_upper": "0.0210",
                    "fundamental_deviation": "H",
                    "tolerance_grade": {
                        "grade": 7,
                        "warnings": []
                    }
                },
                "thread": null,
                "chamfer": null,
                "depth": null,
                "test_dimension": null
            },
            "warnings": [],
            "confidence": 0.98810
        }
    

    or, for a GD&T frame:

    {
        "position": <STRIPPED>,
        "frame": {
            "blurb": "[⟂|0.05|A]",
            "characteristic": "⟂",
            "zone_shape": null,
            "zone_value": {
                "blurb": "0.05",
                "width_min": 0.05,
                "width_max": null,
                "extend_quantity": null,
                "extend_shape": null,
                "extend": null,
                "extend_angle": null
            },
            "zone_combinations": [],
            "zone_offset": null,
            "zone_constraint": null,
            "feature_filter": null,
            "feature_associated": null,
            "feature_derived": null,
            "reference_association": null,
            "reference_parameter": null,
            "material_condition": null,
            "state": null,
            "data": [
                {
                    "blurb": "A"
                }
            ]
        }
    }
    

    Check the documentation on Werk24 for details.
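
Once the measures are available as dictionaries, picking out the base dimensions still needs a rule of your own. The following is only a rough sketch of one possible heuristic, not part of Werk24: it assumes the stock block's height, length, and width are the three largest sizes that are not diameters. The function name `candidate_base_dimensions` and its parameters are hypothetical, and it relies only on the `label.size.size_type` and `label.size.nominal_size` fields visible in the sample output above.

```python
from typing import Any, Dict, List, Tuple


def candidate_base_dimensions(
    measures: List[Dict[str, Any]],
    excluded_size_types: Tuple[str, ...] = ("DIAMETER",),
    count: int = 3,
) -> List[float]:
    """Return the `count` largest nominal sizes, skipping excluded size types.

    This is a heuristic assumption: the overall block dimensions are usually
    the largest non-diameter sizes on a drawing. Verify it on your own data.
    """
    values: List[float] = []
    for measure in measures:
        # Entries without a "label" (e.g. GD&T frames) yield an empty size.
        size = (measure.get("label") or {}).get("size") or {}
        if size.get("size_type") in excluded_size_types:
            continue
        nominal = size.get("nominal_size")
        if nominal is None:
            continue
        # nominal_size is a string in the sample output, so convert it.
        values.append(float(nominal))
    return sorted(values, reverse=True)[:count]
```

You could call this from the hook with the list taken from `message.payload_dict`. Angles and other associative dimensions may need extra filtering; how they are represented in the response is not shown in the sample above, so check the documentation before relying on this.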