Introduction
I recently had the need to import and export CSV files in an MFC application. A CSV (Comma-Separated Values) file is a plain-text file where each row contains one or more fields, separated by commas.
CSV files are probably best known for their use with Microsoft Excel. They provide a convenient format for sharing spreadsheet data between applications, particularly when you consider that having your application work directly with native Excel files would be a very complex task.
The CSV file format is not complex. Mostly, it just requires some simple parsing. One trick is handling a data field that contains a comma. Since commas delimit fields, such a field is enclosed in double quotes. And since double quotes now have special meaning, a field that contains a double quote is also enclosed in double quotes, and each double quote within the data is written as a pair of double quotes. When reading, a pair of double quotes inside a quoted field is interpreted as a single double quote in the data.
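To make the quoting rules concrete, here is a minimal sketch in portable C++ rather than MFC. The helper name QuoteCsvField is hypothetical and is not part of the CCSVFile class shown later; it only illustrates how a single field might be prepared for output under the rules described above.

```cpp
#include <string>

// Hypothetical helper (not part of CCSVFile): returns the field as it
// would appear in a CSV line under the quoting rules described above.
std::string QuoteCsvField(const std::string& field)
{
    // Fields without a comma or double quote are written unchanged
    if (field.find_first_of(",\"") == std::string::npos)
        return field;

    std::string out = "\"";     // Opening quote
    for (char ch : field)
    {
        if (ch == '"')
            out += '"';         // Write each embedded quote as a pair
        out += ch;
    }
    out += '"';                 // Closing quote
    return out;
}
```

With this sketch, a field such as Salt Lake City, Utah comes back wrapped in double quotes, while a field with no comma or quote is returned as-is.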
My CCSVFile Class
The header file for the CCSVFile class is shown in Listing 1, and the source file is shown in Listing 2.
Listing 1: CSVFile.h
#pragma once
#include "afx.h"

class CCSVFile : public CStdioFile
{
public:
    enum Mode { modeRead, modeWrite };
    CCSVFile(LPCTSTR lpszFilename, Mode mode = modeRead);
    ~CCSVFile(void);
    bool ReadData(CStringArray &arr);
    void WriteData(CStringArray &arr);
#ifdef _DEBUG
    Mode m_nMode;
#endif
};
The class is very simple: it only contains two methods (besides the constructor and destructor).
Note that it is up to the caller to ensure that only the ReadData() method is called when the constructor specified read mode, and only the WriteData() method is called when the constructor specified write mode. To help catch mistakes, debug builds (where _DEBUG is defined) assert if the wrong method is called.
Listing 2: CSVFile.cpp
#include "StdAfx.h"
#include "CSVFile.h"

CCSVFile::CCSVFile(LPCTSTR lpszFilename, Mode mode)
    : CStdioFile(lpszFilename, (mode == modeRead) ?
        CFile::modeRead|CFile::shareDenyWrite|CFile::typeText
        :
        CFile::modeWrite|CFile::shareDenyWrite|CFile::modeCreate|CFile::typeText)
{
#ifdef _DEBUG
    m_nMode = mode;
#endif
}

CCSVFile::~CCSVFile(void)
{
}
bool CCSVFile::ReadData(CStringArray &arr)
{
    // Verify correct mode in debug build
    ASSERT(m_nMode == modeRead);

    // Read next line
    CString sLine;
    if (!ReadString(sLine))
        return false;

    LPCTSTR p = sLine;
    int nValue = 0;

    // Parse values in this line
    while (*p != '\0')
    {
        CString s;  // String to hold this value

        if (*p == '"')
        {
            // Bump past opening quote
            p++;
            // Parse quoted value
            while (*p != '\0')
            {
                // Test for quote character
                if (*p == '"')
                {
                    // Found one quote
                    p++;
                    // If pair of quotes, keep one
                    // Else interpret as end of value
                    if (*p != '"')
                    {
                        // Skip the character after the closing quote
                        // (normally the comma), but never step past
                        // the end of the line
                        if (*p != '\0')
                            p++;
                        break;
                    }
                }
                // Add this character to value
                s.AppendChar(*p++);
            }
        }
        else
        {
            // Parse unquoted value
            while (*p != '\0' && *p != ',')
            {
                s.AppendChar(*p++);
            }
            // Advance to next character (if not already end of string)
            if (*p != '\0')
                p++;
        }

        // Add this string to value array
        if (nValue < arr.GetCount())
            arr[nValue] = s;
        else
            arr.Add(s);
        nValue++;
    }

    // Trim off any unused array values
    if (arr.GetCount() > nValue)
        arr.RemoveAt(nValue, arr.GetCount() - nValue);

    // We return true if ReadString() succeeded--even if no values
    return true;
}
void CCSVFile::WriteData(CStringArray &arr)
{
    static TCHAR chQuote = '"';
    static TCHAR chComma = ',';

    // Verify correct mode in debug build
    ASSERT(m_nMode == modeWrite);

    // Loop through each string in array
    for (int i = 0; i < arr.GetCount(); i++)
    {
        // Separate this value from previous
        if (i > 0)
            WriteString(_T(","));

        // We need special handling if string contains
        // comma or double quote
        bool bComma = (arr[i].Find(chComma) != -1);
        bool bQuote = (arr[i].Find(chQuote) != -1);
        if (bComma || bQuote)
        {
            Write(&chQuote, sizeof(TCHAR));
            if (bQuote)
            {
                // Step through the characters of this value (index j)
                for (int j = 0; j < arr[i].GetLength(); j++)
                {
                    // Pairs of quotes interpreted as single quote
                    if (arr[i][j] == chQuote)
                        Write(&chQuote, sizeof(TCHAR));
                    TCHAR ch = arr[i][j];
                    Write(&ch, sizeof(TCHAR));
                }
            }
            else
            {
                WriteString(arr[i]);
            }
            Write(&chQuote, sizeof(TCHAR));
        }
        else
        {
            WriteString(arr[i]);
        }
    }
    WriteString(_T("\n"));
}
There are many ways to go about parsing text. I was more comfortable manually stepping through the text character by character, so that's what the code does.
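For readers without an MFC project at hand, the same character-by-character approach can be sketched in portable C++. This is an illustration rather than a replacement for ReadData(): the function name ParseCsvLine is hypothetical, and std::string and std::vector stand in for CString and CStringArray. Like the listing above, the sketch does not report an empty final field after a trailing comma, and an empty line yields no values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of CCSVFile): splits one CSV line into
// its values by stepping through the text one character at a time.
std::vector<std::string> ParseCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    const std::size_t n = line.size();
    std::size_t i = 0;

    while (i < n)
    {
        std::string value;
        if (line[i] == '"')
        {
            ++i;                            // Bump past opening quote
            while (i < n)
            {
                if (line[i] == '"')
                {
                    if (i + 1 < n && line[i + 1] == '"')
                    {
                        value += '"';       // Pair of quotes: keep one
                        i += 2;
                        continue;
                    }
                    ++i;                    // Single quote: end of value
                    break;
                }
                value += line[i++];
            }
            if (i < n && line[i] == ',')
                ++i;                        // Skip comma after closing quote
        }
        else
        {
            while (i < n && line[i] != ',')
                value += line[i++];
            if (i < n)
                ++i;                        // Skip comma
        }
        fields.push_back(value);
    }
    return fields;
}
```

One difference from the listing is that the sketch only skips a comma after a closing quote, so any other character found there starts a new unquoted value instead of being dropped.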
As implied by the names, ReadData() is used to read from a CSV file while WriteData() is used to write to one. Each line of data is stored in a CStringArray, which is passed by reference to both functions. The caller must call these methods once for each line. ReadData() returns false when the end of the file is reached.
Conclusion
As mentioned, the code is fairly simple, but I've used it on several occasions and have even ported it to C#. Perhaps you will find it useful as well.
End-User License
Use of this article and any related source code or other files is governed
by the terms and conditions of
.
Author Information
Jonathan Wood
I'm a software/website developer working out of the greater Salt Lake City area in Utah. I've developed many websites including Black Belt Coder, Insider Articles, and others.