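Extract table information from multiple Word files into different Excel sheets

Extract the table contents of multiple Word files into Excel.

Extracting a single Word file into different Excel sheets already works. I'm currently stuck on using glob to read the tables from the Word files in a folder.

The different sheets hold different pieces of information from the Word tables.

The error says: NameError: name 'tables' is not defined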


doc_ls=[]
dic1={}
for path in glob.glob('文件夹\*.docx'):
    doc = Document(path)
    for table in doc.tables:
        for i, row in enumerate(table.rows[1:7]): 
            row_content = []
            for cell in row.cells[:]: 
                c = cell.text
                row_content.append(c)
                print (row_content)
table = tables[0]  # NameError is raised here: 'tables' is never assigned above
finaldict1 = {}
n = 5
for i, row in enumerate(table.rows[n:n+1]):

    row_content = []
    for cell in row.cells[1:6]: 
        c = cell.text
        row_content.append(c)
    allcol = row_content
for c in allcol:

    finaldict1[c] = []
n += 1

I'm a Python beginner.

Replies: 6
Jason990420

The code formatting is wrong and the indentation is off. Please refer to the following and then re-edit your post.

Code highlighting

```python

your code

```

  • Which libraries did you import for glob and Document?
  • Where do you define the variable tables before using it in table = tables[0]?
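
The second point is where the NameError comes from: the loop only ever binds `doc` and `table`, so nothing named `tables` exists when `tables[0]` is evaluated. A minimal sketch of one way to close that gap follows. It assumes python-docx and that the later part of the question's code wants the first table found; `collect_tables` and `header_cells` are hypothetical helper names, not part of any library.

```python
import glob
import os


def collect_tables(folder):
    """Return every table from every .docx file in folder, in sorted path order."""
    # python-docx, imported here so header_cells can be used without it
    from docx import Document

    tables = []
    for path in sorted(glob.glob(os.path.join(folder, "*.docx"))):
        tables.extend(Document(path).tables)
    return tables


def header_cells(table, row_index=5, first_col=1, last_col=6):
    """Cell texts of one row, mirroring rows[n:n+1] and cells[1:6] from the question."""
    rows = table.rows[row_index:row_index + 1]
    if not rows:  # the table has too few rows
        return []
    return [cell.text for cell in rows[0].cells[first_col:last_col]]
```

With these, `tables = collect_tables('文件夹')` defines the missing name, and `finaldict1 = {c: [] for c in header_cells(tables[0])}` builds the dictionary once `tables` is known to be non-empty.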
1 year ago · Comment
JING123345567 (OP) 1 year ago
Jason990420

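Reading the tables from the Word files in a folder with glob works fine! I'm not sure what the later part of your code is meant to do.

Here is the code I'd suggest: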

import glob
from docx import Document
import xlsxwriter

with xlsxwriter.Workbook('文件夹/Tables.xlsx') as workbook:
    for path in glob.glob('文件夹/*.docx'):
        doc = Document(path)
        for table in doc.tables:
            worksheet = workbook.add_worksheet()
            for row_num, row in enumerate(table.rows):
                row_content = []
                for cell in row.cells:
                    row_content.append(cell.text)
                worksheet.write_row(row_num, 0, row_content)
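
Two details that the snippet above leaves to defaults can matter in practice, so here is a hedged sketch rather than part of the original answer. While a document is open, Word keeps an owner file whose name starts with `~$` next to it; `glob` matches that file, but it is not a valid .docx. Excel also limits sheet names to 31 characters and rejects the characters `[ ] : * ? / \`. The helper names `docx_paths` and `make_sheet_name` are hypothetical, and the sketch covers only length, forbidden characters, and duplicates.

```python
import glob
import os
import re


def docx_paths(folder):
    """List .docx files in folder, skipping Word's '~$' owner (lock) files."""
    paths = glob.glob(os.path.join(folder, "*.docx"))
    return sorted(p for p in paths if not os.path.basename(p).startswith("~$"))


def make_sheet_name(path, table_index, used):
    """Build a unique, Excel-safe sheet name from a file name and a table index."""
    stem = os.path.splitext(os.path.basename(path))[0]
    stem = re.sub(r"[\[\]:*?/\\]", "_", stem)  # characters Excel rejects
    suffix = f"_{table_index}"
    name = stem[:31 - len(suffix)] + suffix  # at most 31 characters
    n = 1
    while name.lower() in used:  # sheet names are compared case-insensitively
        extra = f"{suffix}_{n}"
        name = stem[:31 - len(extra)] + extra
        n += 1
    used.add(name.lower())
    return name
```

In the loop above this could be used as `workbook.add_worksheet(make_sheet_name(path, idx, used))`, with `idx` coming from `enumerate(doc.tables)` and `used = set()` created once before the loop.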
1 year ago · Comment
JING123345567 (OP) 1 year ago
Jason990420

I have not checked whether this works or is correct.

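import glob

import pandas as pd
from docx import Document

# Assumption: worddocs_list is built from the same folder as in the earlier snippet.
worddocs_list = [Document(path) for path in glob.glob('文件夹/*.docx')]

# The with block saves the file on exit; writer.save() was removed in pandas 2.0.
with pd.ExcelWriter("./test.xlsx", engine='xlsxwriter') as writer:
    sheet_num = 0  # separate counter, so no inner loop can overwrite it
    for wordDoc in worddocs_list:
        for table in wordDoc.tables:
            lst = []
            for row in table.rows[5:6]:  # sixth row only
                row_lst = []
                for cell in row.cells[1:7]:  # second to seventh cell
                    row_lst.append(cell.text)
                lst.append(row_lst)
            if not lst:  # the table has fewer than six rows
                continue
            # One column per extracted cell, named Column_0, Column_1, ...
            columns = [f'Column_{j}' for j in range(len(lst[0]))]
            df = pd.DataFrame(lst, columns=columns)
            print(df)
            df.to_excel(writer, sheet_name=f'Sheet{sheet_num}')
            sheet_num += 1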
1 year ago · Comment
JING123345567 (OP) 1 year ago
