How to decode .dxf files?

I would like to convert the drawings inside a .dxf file into g-code. There are tools that do this, but I would like to code it myself. So the very first step is to decode the .dxf format. Yet the contents of the .dxf file do not look easy to decipher.

I downloaded a .dxf file here and opened it in a text editor.

I am also referring to this manual. It looks like most of what is inside a .dxf file is style and configuration information, and I am inclined to ignore almost all of it. So, 1. can you specify the attributes that should not be omitted, if there are any?

As far as I know, the figures in a .dxf file are divided into multiple ENTITIES. Therefore, I am pasting only the ENTITIES section. Note that there are 6 SECTIONs in the file and the last one (~~BLOCKS~~ OBJECTS) is the longest, although I don't know what that part represents (it would be nice if you could explain).

In the listing below, 10 and 20 should represent the X and Y positions and 42 should represent the bulge, so it is more or less possible to trace the polyline. I am considering extracting information from the file by using the TITLES as navigation points, together with numbers like 10, 20 and 42. But there are two polylines below. So, 2. which polyline should I take into consideration, and what is the purpose of the other?

  0
SECTION
  2
ENTITIES
  0
LWPOLYLINE
  5
72    # What
330   # are
1F    # these
100   # numbers?
AcDbEntity
  8
Layer 1
100
AcDbPolyline
 90
       12
 70
     1
 43    # Constant width (optional; default = 0)
0.0
 10
11.7511418685121
 20
14.9867256637168
 42
1.0
 10
3.31114186851211
 20
14.9867256637168
 10
-0.0132743362831871
 20
14.9867256637168
 10
-0.0132743362831871
 20
11.72
 42
1.0
 10
-0.0132743362831871
 20
3.28
 10
-0.0132743362831871
 20
0.0398230088495577
 10
3.28
 20
0.039823008849557
 42
1.0
 10
11.72
 20
0.0398230088495577
 10
15.0132743362832
 20
0.0398230088495577
 10
15.0132743362832
 20
3.28
 42
1.0
 10
15.0132743362832
 20
11.72
 10
15.0132743362832
 20
14.9867256637168
  0
LWPOLYLINE
  5
73
330
1F
100
AcDbEntity
  8
Layer 1
100
AcDbPolyline
 90
       12
 70
     1
 43
0.0
 10
12.6544611051008
 20
15.9867256637168
 10
16.0132743362832
 20
15.9867256637168
 10
16.0132743362832
 20
12.6233192365887
 42
-0.823684764724874
 10
16.0132743362832
 20
2.37668076341128
 10
16.0132743362832
 20
-0.960176991150442
 10
12.6233192365887
 20
-0.960176991150442
 42
-0.823684764724874
 10
2.37668076341128
 20
-0.960176991150443
 10
-1.01327433628319
 20
-0.960176991150442
 10
-1.01327433628319
 20
2.37668076341128
 42
-0.823684764724874
 10
-1.01327433628319
 20
12.6233192365887
 10
-1.01327433628319
 20
15.9867256637168
 10
2.40782263192339
 20
15.9867256637168
 42
-0.823684764724874
  0
ENDSEC

Solution

The last section (BLOCKS) is the longest one although I don't know what that part represents (Would be nice if you could explain).

The purpose of the BLOCKS section is summarized in the manual you referred to:

The BLOCKS section contains an entry for each block reference in the drawing.

Think of a block as a group of entities that are grouped together as one element. The block has:

Origin
Rotation
Scale

Such blocks are referenced in the drawing itself and each instance of the block is called an INSERT.

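So when you walk the ENTITIES section and hit an INSERT entity, you have to look up the block it refers to in the BLOCKS section (an INSERT names its block in group code 2) and process that block's entities, applying the insert's position, rotation and scale.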
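The placement step can be sketched roughly as follows. This is only a minimal sketch, not something taken from the manual: `place_block` and its parameter names are hypothetical, and it assumes the block's entities have already been located and reduced to 2D points. It also assumes the usual INSERT group codes (10/20 for the insertion point, 41/42 for the X/Y scale, 50 for the rotation in degrees) and that the block's base point is subtracted before scaling and rotating.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def place_block(
    points: List[Point],
    base: Point,
    insert_at: Point,
    scale: Tuple[float, float] = (1.0, 1.0),
    rotation_deg: float = 0.0,
) -> List[Point]:
    """Map block-local points into drawing coordinates for one INSERT.

    Assumed order of operations: shift by the block base point, scale,
    rotate counterclockwise, then move to the insertion point.
    """
    angle = math.radians(rotation_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    placed = []
    for x, y in points:
        sx = (x - base[0]) * scale[0]
        sy = (y - base[1]) * scale[1]
        placed.append(
            (
                insert_at[0] + sx * cos_a - sy * sin_a,
                insert_at[1] + sx * sin_a + sy * cos_a,
            )
        )
    return placed


# Hypothetical usage: one block point, inserted at (10, 5),
# scaled by 2 and rotated by 90 degrees.
placed_points = place_block([(1.0, 0.0)], (0.0, 0.0), (10.0, 5.0), (2.0, 2.0), 90.0)
```

If the drawing contains no INSERT entities, as appears to be the case for the ENTITIES section in the question, this step can be skipped entirely.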

There are some DXF codes that are common to many entities and they are not always listed with the information for a specific entity type (like LWPOLYLINE).

Look at this complete list for those numbers:

5: Entity handle; text string of up to 16 hexadecimal digits (fixed)

330: Soft-pointer handle; arbitrary soft pointers to other objects within same DXF file or drawing. Translated during INSERT and XREF operations

100: Subclass data marker (with derived class name as a string). Required for all objects and entity classes that are derived from another concrete class. The subclass data marker segregates data defined by different classes in the inheritance chain for the same object. This is in addition to the requirement for DXF names for each distinct concrete class derived from ObjectARX (see Subclass Markers)
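To make the role of these group codes concrete, here is a minimal Python sketch of a reader for the ASCII form of the file. It assumes the file is a strict sequence of line pairs (group code, then value) and only handles the codes that appear in the question: 8 (layer), 70 (flags, bit 1 = closed), 10/20 (vertex X/Y) and 42 (bulge of the preceding vertex). Everything else, including 5, 330 and 100, is simply skipped. The function names `read_tags` and `lwpolylines` are made up for this example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Tag = Tuple[int, str]


def read_tags(text: str) -> Iterator[Tag]:
    """Yield (group_code, value) pairs from ASCII DXF text."""
    lines = text.splitlines()
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i].strip()), lines[i + 1].strip()


def lwpolylines(tags: Iterable[Tag]) -> List[Dict]:
    """Collect the LWPOLYLINE entities found in the ENTITIES section."""
    result: List[Dict] = []
    current = None
    in_entities = False
    prev: Tag = (-1, "")
    for code, value in tags:
        if code == 0:
            # Group code 0 starts a new entity or section marker,
            # so any polyline being collected is finished.
            if current is not None:
                result.append(current)
                current = None
            if value == "ENDSEC":
                in_entities = False
            elif value == "LWPOLYLINE" and in_entities:
                current = {"layer": None, "closed": False, "vertices": []}
        elif code == 2 and prev == (0, "SECTION"):
            in_entities = value == "ENTITIES"
        elif current is not None:
            if code == 8:
                current["layer"] = value
            elif code == 70:
                current["closed"] = bool(int(value) & 1)
            elif code == 10:
                current["vertices"].append([float(value), 0.0, 0.0])  # x, y, bulge
            elif code == 20:
                current["vertices"][-1][1] = float(value)
            elif code == 42:
                current["vertices"][-1][2] = float(value)
        prev = (code, value)
    if current is not None:
        result.append(current)
    return result


# A shortened version of the first polyline from the question.
SAMPLE = "\n".join([
    "0", "SECTION", "2", "ENTITIES",
    "0", "LWPOLYLINE", "5", "72", "8", "Layer 1", "90", "2", "70", "1",
    "10", "11.75", "20", "14.98", "42", "1.0",
    "10", "3.31", "20", "14.98",
    "0", "ENDSEC", "0", "EOF",
])
polylines = lwpolylines(read_tags(SAMPLE))
```

Each vertex comes out as `[x, y, bulge]`, with a bulge of 0.0 for straight segments.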

This page is also useful.
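Since the goal in the question is g-code, the bulge values (group code 42) are worth a short illustration. The sketch below assumes the usual DXF definition, bulge = tan(included angle / 4), with a positive bulge meaning a counterclockwise arc from the vertex to the next one. The helper names are hypothetical, and the output assumes a controller that uses G2/G3 with I/J given relative to the start point, which may not match your machine.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def bulge_to_arc(p1: Point, p2: Point, bulge: float) -> Tuple[Point, float, bool]:
    """Return (center, radius, counterclockwise) for a bulged segment.

    The bulge must be non-zero; a zero bulge is a straight line.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    chord = math.hypot(dx, dy)
    # Signed offset of the center from the chord midpoint, measured
    # along the left-hand normal (-dy, dx) of the chord.
    k = (1.0 - bulge * bulge) / (4.0 * bulge)
    cx = (p1[0] + p2[0]) / 2.0 - k * dy
    cy = (p1[1] + p2[1]) / 2.0 + k * dx
    radius = chord * (1.0 + bulge * bulge) / (4.0 * abs(bulge))
    return (cx, cy), radius, bulge > 0


def segment_to_gcode(p1: Point, p2: Point, bulge: float) -> str:
    """Emit one g-code move from p1 to p2 (line or arc)."""
    if bulge == 0.0:
        return f"G1 X{p2[0]:.4f} Y{p2[1]:.4f}"
    (cx, cy), _, ccw = bulge_to_arc(p1, p2, bulge)
    word = "G3" if ccw else "G2"
    return f"{word} X{p2[0]:.4f} Y{p2[1]:.4f} I{cx - p1[0]:.4f} J{cy - p1[1]:.4f}"


# A bulge of 1.0, as in the first polyline of the question, is a semicircle.
line = segment_to_gcode((0.0, 0.0), (2.0, 0.0), 1.0)
```

For a closed polyline (flag 70 with bit 1 set), the last vertex connects back to the first, so its bulge applies to that closing segment.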

Why are there 2 LWPOLYLINES in the first place and why is it not only one BLOCK-ENDBLK pair?

If you read through the section about BLOCKS you'll see:

Model Space and Paper Space Block Definitions

Three empty definitions always appear in the BLOCKS section. They are titled *Model_Space, *Paper_Space and *Paper_Space0. These definitions manifest the representations of model space and paper space as block definitions internally. The internal name of the first paper space layout is *Paper_Space, the second is *Paper_Space0, the third is *Paper_Space1, and so on.
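When scanning the BLOCKS section, these layout definitions can be told apart from user-defined blocks by name. The following is a small sketch under some assumptions: the tags have already been read as (group code, value) pairs, the block name is the first group code 2 after a `BLOCK` marker, and the name comparison is done case-insensitively as a precaution. `block_names` and `is_layout_block` are hypothetical helpers.

```python
from typing import Iterable, List, Tuple

Tag = Tuple[int, str]


def block_names(tags: Iterable[Tag]) -> List[str]:
    """Return the name of every BLOCK definition, in file order."""
    names: List[str] = []
    in_block_header = False
    for code, value in tags:
        if code == 0:
            in_block_header = value == "BLOCK"
        elif code == 2 and in_block_header:
            names.append(value)
            in_block_header = False
    return names


def is_layout_block(name: str) -> bool:
    """True for the *Model_Space / *Paper_Space... definitions."""
    return name.upper().startswith(("*MODEL_SPACE", "*PAPER_SPACE"))


# Hypothetical tag stream with two layout blocks and one user block.
EXAMPLE_TAGS = [
    (0, "SECTION"), (2, "BLOCKS"),
    (0, "BLOCK"), (8, "0"), (2, "*Model_Space"), (0, "ENDBLK"),
    (0, "BLOCK"), (8, "0"), (2, "*Paper_Space"), (0, "ENDBLK"),
    (0, "BLOCK"), (8, "0"), (2, "BOLT"), (0, "ENDBLK"),
    (0, "ENDSEC"),
]
user_blocks = [n for n in block_names(EXAMPLE_TAGS) if not is_layout_block(n)]
```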