Searching for a String in the Files of a Target Folder

# -*- coding: utf-8 -*-

# When a folder contains files in several encodings, run the search once per encoding so that every file is decoded and searched correctly.

import os
from datetime import datetime

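file_encodings = ["cp949","utf8"]
# Timestamp for the result file name. Note: this rebinds the name datetime from the class to a string.
datetime = datetime.now().strftime("%Y%m%d%H%M00")

target_texts = [] # strings to search for (one or more)
except_files = [] # file names to exclude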

def search_text(file_path, encoding_word):
    # Writes to save_file, the result file opened at module level below.
    # Each entry is "line number - text" for a line containing a search string.
    exists_array = []

    try:
        # The with block closes the file even when decoding fails.
        with open(file_path, mode="r", encoding=encoding_word) as target_file:
            for index, line in enumerate(target_file):
                # Record a matching line once, even if several search strings match it.
                if any(target_text in line for target_text in target_texts):
                    exists_array.append("{0} - {1}".format(index + 1, line.strip()))
    except UnicodeDecodeError:
        # The file cannot be decoded with this encoding; skip it here and
        # leave it to the pass that uses the other encoding.
        return

    if len(exists_array) > 0:
        print(file_path, file=save_file)
        for exists_item in exists_array:
            print(exists_item, file=save_file)

# Single target folder
root_dir = r"{{ROOT_DIR}}"

save_file_path = r"{{SAVE_FILE_FOLDER}}\find_text_result-{0}.txt".format(datetime)

# The with block closes save_file on exit, so no explicit close() is needed.
with open(save_file_path, "w", encoding="utf8") as save_file:

    for encoding_word in file_encodings:
        print("encoding : {0} \n".format(encoding_word), file=save_file)

        for (root, dirs, files) in os.walk(root_dir):
            for file_name in files:
                filename, fileExtension = os.path.splitext(file_name)

                if fileExtension in (".java", ".jsp", ".js", ".html") and file_name not in except_files:
                    # os.path.join avoids the invalid "\{" escape of a hand-built path.
                    trans_file_path = os.path.join(root, file_name)

                    search_text(trans_file_path, encoding_word)  # run the search

Tested on Python 3.6.
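
The script above walks the whole folder once per encoding. As a rough alternative sketch, each file could instead be decoded with the first encoding that works, so it is read and reported only once. The helper names `read_lines_with_fallback` and `find_matches` are hypothetical and not part of the original script. UTF-8 is tried first on the assumption that text saved in cp949 rarely happens to be valid UTF-8.

```python
def read_lines_with_fallback(file_path, encodings=("utf8", "cp949")):
    """Return (encoding, lines) for the first encoding that decodes the file.

    Hypothetical helper for illustration. Returns (None, []) when no
    encoding in the list can decode the file.
    """
    for encoding_word in encodings:
        try:
            with open(file_path, mode="r", encoding=encoding_word) as target_file:
                return encoding_word, target_file.read().splitlines()
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    return None, []


def find_matches(lines, target_texts):
    """Return (line_number, stripped_line) pairs that contain any search string."""
    return [
        (index + 1, line.strip())
        for index, line in enumerate(lines)
        if any(target_text in line for target_text in target_texts)
    ]
```

With these two helpers, the `os.walk` loop would call `read_lines_with_fallback` once per file and pass the resulting lines to `find_matches`, instead of repeating the whole walk for every entry in `file_encodings`.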


KSK's IT Blog
