I'm trying to use OCR to extract only the base dimensions of a CAD model, but there are other associative dimensions that I don't need (like angles, length from baseline to hole, etc). Here is an example of a technical drawing. (The numbers in red circles are the base dimensions, the rest in purple highlights are the ones to ignore.) How can I tell my program to extract only the base dimensions (the height, length, and width of a block before it goes through the CNC)?
The issue is that the drawings I get are not in a specific format, so I can't tell the OCR where the dimensions are. It has to figure out on its own contextually.
Should I train the program through machine learning by running several iterations and correcting it? If so, what methods are there? The only thing I can think of are Opencv cascade classifiers. Or are there other methods to solving this problem? Sorry for the long post. Thanks.
I feel you... it's a very tricky problem, and we spent the last 3 years finding a solution for it. Forgive me for mentioning the own solution, but it will certainly solve your problem: pip install werk24
from werk24 import Hook, W24AskVariantMeasures
from werk24.models.techread import W24TechreadMessage
from werk24.utils import w24_read_sync
from . import get_drawing_bytes # define your own
def recv_measures(message: W24TechreadMessage) -> None:
for cur_measure in message.payload_dict.get('measures'):
print(cur_measure)
if __name__ == "__main__":
# define what information you want to receive from the API
# and what shall be done when the info is available.
hooks = [Hook(ask=W24AskVariantMeasures(), function=recv_measures)]
# submit the request to the Werk24 API
w24_read_sync(get_drawing_bytes(), hooks)
In your example it will return for example the following measure
{
"position": <STRIPPED>
"label": {
"blurb": "ø30 H7 +0.0210/0",
"quantity": 1,
"size": {
"blurb": "30",
"size_type":" "DIAMETER",
"nominal_size": "30.0",
},
"unit": "MILLIMETER",
"size_tolerance": {
"toleration_type": "FIT_SIZE_ISO",
"blurb": "H7",
"deviation_lower": "0.0",
"deviation_upper": "0.0210",
"fundamental_deviation": "H",
"tolerance_grade": {
"grade":7,
"warnings":[]
},
"thread": null,
"chamfer": null,
"depth":null,
"test_dimension": null,
},
"warnings": [],
"confidence": 0.98810
}
or for a GD&T
{
"position": <STRIPPED>,
"frame": {
"blurb": "[⟂|0.05|A]",
"characteristic": "⟂",
"zone_shape": null,
"zone_value": {
"blurb": "0.05",
"width_min": 0.05,
"width_max": null,
"extend_quantity": null,
"extend_shape": null,
"extend": null,
"extend_angle": null
},
"zone_combinations": [],
"zone_offset": null,
"zone_constraint": null,
"feature_filter": null,
"feature_associated": null,
"feature_derived": null,
"reference_association": null,
"reference_parameter": null,
"material_condition": null,
"state": null,
"data": [
{
"blurb": "A"
}
]
}
}
Check the documentation on Werk24 for details.