생각하는 일상

2016. 11. 15. 16:29, Computer/Algorithm

How to Read a Big File in C++

Reading large files in C++

If you look up references, they usually recommend writing code like the following.

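#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

void fileread(const char* _path){
    FILE *pFile = fopen(_path, "rb");
    if (pFile == NULL) {
        cout << "cannot open file" << endl;
        return;
    }

    //read size of file
    fseek(pFile, 0, SEEK_END);
    long lSize = ftell(pFile);
    fseek(pFile, 0, SEEK_SET);

    char *buff = (char*)malloc(sizeof(char)*lSize);
    if (buff == NULL) {
        cout << "cannot allocate buffer" << endl;
        fclose(pFile);
        return;
    }

    //read all with a single fread call
    size_t result = fread(buff, sizeof(char), lSize, pFile);
    if (result != (size_t)lSize) {
        cout << "not read all file" << endl;
    }

    //process the data in buff here

    free(buff);
    fclose(pFile);
}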

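However, the project I worked on this time was building a search engine, and the file that indexed 550,000 documents was 2.4GB. With a file this large, a single fread call may stop before the whole file has been read. Since fread returns how much it actually read, you can resume from the point where the read stopped, and load the entire large file into memory in one pass as follows.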

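#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

void fileread(const char* _path){
    FILE *pFile = fopen(_path, "rb");
    if (pFile == NULL) {
        cout << "cannot open file" << endl;
        return;
    }

    //read size of file
    fseek(pFile, 0, SEEK_END);
    long lSize = ftell(pFile);
    fseek(pFile, 0, SEEK_SET);

    char *buff = (char*)malloc(sizeof(char)*lSize);
    if (buff == NULL) {
        cout << "cannot allocate buffer" << endl;
        fclose(pFile);
        return;
    }

    size_t totnum = 0;  //bytes read so far
    size_t curnum = 0;  //bytes read by the latest fread call

    //read all big file: each call resumes where the previous one stopped
    while ((curnum = fread(&buff[totnum], sizeof(char), (size_t)lSize - totnum, pFile)) > 0) {
        totnum += curnum;
    }

    if (totnum != (size_t)lSize) {
        cout << "not read all file" << endl;
    }

    //process the data in buff here

    free(buff);
    fclose(pFile);
}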
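One caveat with sizing the buffer through ftell: it returns a long, and on platforms where long is 32 bits it cannot represent a 2.4GB size. As a rough sketch of the same "keep calling fread until it returns 0" idea without needing the size up front, the helper below reads fixed-size chunks and appends them to a std::vector<char>. The name readWholeFile and the 1 MiB chunk size are assumptions made for illustration, not part of the original code.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not from the original post): reads the whole file at
// `path` into `out`. Returns false if the file cannot be opened or if a read
// error occurs.
bool readWholeFile(const char* path, std::vector<char>& out) {
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr) {
        return false;
    }

    // Assumption: 1 MiB per fread call; any reasonably large size would do.
    const std::size_t kChunkSize = 1 << 20;
    std::vector<char> chunk(kChunkSize);
    out.clear();

    // fread returns the number of bytes actually read, so keep calling it
    // until it returns 0 (end of file or error).
    std::size_t got = 0;
    while ((got = std::fread(chunk.data(), 1, chunk.size(), fp)) > 0) {
        out.insert(out.end(), chunk.data(), chunk.data() + got);
    }

    // A 0 return can mean either end of file or an error; ferror tells which.
    const bool ok = (std::ferror(fp) == 0);
    std::fclose(fp);
    return ok;
}
```

A caller would declare a std::vector<char> and pass it in together with the path. The trade-off is that the vector may reallocate while it grows, so when the file size is known and fits the platform's types, allocating once as in the code above is likely cheaper.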

So why read the file all at once?

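The reason is simple: file I/O is very slow compared with memory access, so a program written to read one character or one line at a time pays the cost of a read call over and over and ends up inefficient.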
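To illustrate the point, once the file contents sit in a buffer, per-line work turns into a scan over memory with no further I/O calls. The sketch below counts lines in such a buffer with memchr. The function name countLines is hypothetical, and it assumes lines end with '\n'.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts the '\n' bytes in a buffer that is already in
// memory, for example the buff filled by fileread.
std::size_t countLines(const char* data, std::size_t size) {
    assert(data != nullptr || size == 0);

    std::size_t lines = 0;
    const char* p = data;
    const char* end = data + size;

    while (p < end) {
        // memchr finds the next newline in the remaining bytes.
        const void* hit = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        if (hit == nullptr) {
            break;  // no more newlines
        }
        ++lines;
        p = static_cast<const char*>(hit) + 1;  // continue after the newline
    }
    return lines;
}
```

The same loop shape works for splitting the buffer into lines: the range from p up to hit is one line, and no read call is needed to get it.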

