vb.net csv stream blob textfieldparser

Parsing CSV from stream with TextFieldParser always reaches EndOfData


When I parse a CSV file as a stream downloaded from Azure Blob storage, TextFieldParser reaches EndOfData immediately, without reading any data. The same code works when I pass the path to the same physical file instead of a stream.

    Dim storageAccount As CloudStorageAccount = CloudStorageAccount.Parse(AzureStorageConnection)
    Dim blobClient As CloudBlobClient = storageAccount.CreateCloudBlobClient()
    Dim BlobList As IEnumerable(Of CloudBlockBlob) = blobClient.GetContainerReference("containername").ListBlobs().OfType(Of CloudBlockBlob)

    For Each blb In BlobList
        Dim myList As New List(Of MyBusinessObject)

        Using memoryStream = New MemoryStream()
            blb.DownloadToStream(memoryStream)

            Using Reader As New FileIO.TextFieldParser(memoryStream)
                Reader.TextFieldType = FileIO.FieldType.FixedWidth
                Reader.SetFieldWidths(2, 9, 10)
                Dim currentRow As String()
                While Not Reader.EndOfData
                    Try
                        currentRow = Reader.ReadFields()
                        myList.Add(New GsmXFileRow() With {
                        ' code to read currentRow and add elements to myList
                        })
                    Catch ex As FileIO.MalformedLineException
                    End Try
                End While
            End Using
        End Using
    Next

I have also tried wrapping the MemoryStream in a TextReader:

    Dim myTextReader As TextReader = New StreamReader(memoryStream)

and then passing myTextReader to TextFieldParser, but this does not work either:

    Using Reader As New FileIO.TextFieldParser(myTextReader)


Solution

  • I see this:

    Value of Length property equals file size

    and this:

    Position property has same value

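    That means that by the time the TextFieldParser starts reading, the MemoryStream is already positioned at the end of its data: DownloadToStream writes into the stream and leaves Position where the write finished. Set Position back to 0 before creating the parser, and it will read from the beginning.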
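
    The effect can be reproduced without Azure at all. The following is a minimal sketch, assuming a console project that can reference Microsoft.VisualBasic.FileIO; the module name and the sample line are made up for illustration, and the plain Write call stands in for DownloadToStream:

    ```vb
    Imports System.IO
    Imports System.Text
    Imports Microsoft.VisualBasic.FileIO

    Module RewindDemo
        Sub Main()
            ' Hypothetical sample row matching the field widths 2, 9, 10.
            Dim bytes = Encoding.UTF8.GetBytes("AB123456789ABCDEFGHIJ" & vbCrLf)

            Using ms As New MemoryStream()
                ' Stand-in for blb.DownloadToStream(ms): writing advances the cursor.
                ms.Write(bytes, 0, bytes.Length)

                ' True: the cursor now sits at the end of the written data.
                Console.WriteLine(ms.Position = ms.Length)

                ' Rewind so the parser starts at the first byte.
                ms.Position = 0

                Using parser As New TextFieldParser(ms)
                    parser.TextFieldType = FieldType.FixedWidth
                    parser.SetFieldWidths(2, 9, 10)

                    While Not parser.EndOfData
                        Dim fields As String() = parser.ReadFields()
                        Console.WriteLine(String.Join("|", fields))
                    End While
                End Using
            End Using
        End Sub
    End Module
    ```

    Commenting out the ms.Position = 0 line should bring back the behaviour from the question: the loop body never runs.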

    However, there may be another issue here, too. That stream data is binary with some unknown encoding. The TextFieldParser wants to work with Text. You need a way to give the TextFieldParser information about what encoding is used.

    In this case, I recommend a StreamReader. This type inherits from TextReader, so you can pass it to the TextFieldParser constructor:

    Dim storageAccount As CloudStorageAccount = CloudStorageAccount.Parse(AzureStorageConnection)
    Dim blobClient As CloudBlobClient = storageAccount.CreateCloudBlobClient()
    Dim BlobList As IEnumerable(Of CloudBlockBlob) = blobClient.GetContainerReference("containername").ListBlobs().OfType(Of CloudBlockBlob)
    
    Dim myList As New List(Of MyBusinessObject)
    For Each blb In BlobList
    
        'Several constructor overloads allow you to specify the encoding here
        Using blobData As New StreamReader(New MemoryStream())
            'StreamReader exposes the wrapped stream through BaseStream
            blb.DownloadToStream(blobData.BaseStream)

            'Fix the position problem
            blobData.BaseStream.Position = 0

            Using Reader As New FileIO.TextFieldParser(blobData)
                Reader.TextFieldType = FileIO.FieldType.FixedWidth
                Reader.SetFieldWidths(2, 9, 10)
                While Not Reader.EndOfData
                    Try
                        'Read inside the loop so that the last row is processed too
                        Dim currentRow As String() = Reader.ReadFields()
                        myList.Add(New GsmXFileRow() With {
                            ' code to read currentRow and add elements to myList
                        })
                    Catch ex As FileIO.MalformedLineException
                        'Malformed lines are skipped
                    End Try
                End While
            End Using 
        End Using
    Next
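
    To make the encoding point concrete, here is a small sketch that passes an explicit encoding to the StreamReader constructor. It assumes, purely for illustration, that the data is ISO-8859-1 text; the module name and sample row are hypothetical, and the real blob may well use a different encoding:

    ```vb
    Imports System.IO
    Imports System.Text
    Imports Microsoft.VisualBasic.FileIO

    Module EncodingDemo
        Sub Main()
            ' Assumption: the source text is ISO-8859-1 (Latin-1).
            Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")

            ' Hypothetical row: 2 + 9 + 10 characters, padded with spaces.
            Dim bytes = latin1.GetBytes("DEMünchen  Bayern    " & vbCrLf)

            ' A MemoryStream created from a byte array starts at Position 0,
            ' so no rewind is needed in this particular sketch.
            Using reader As New StreamReader(New MemoryStream(bytes), latin1)
                Using parser As New TextFieldParser(reader)
                    parser.TextFieldType = FieldType.FixedWidth
                    parser.SetFieldWidths(2, 9, 10)

                    While Not parser.EndOfData
                        Dim fields As String() = parser.ReadFields()
                        Console.WriteLine(String.Join("|", fields))
                    End While
                End Using
            End Using
        End Sub
    End Module
    ```

    If the reader were created without the encoding argument, StreamReader would fall back to UTF-8 and the ü byte would not decode as intended.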