I needed to generate a table of contents for the post-content markup of the "2021 Python libraries" roundup page.

    from bs4 import BeautifulSoup

    # Parse the rendered post and locate its first top-level heading.
    with open('C:/Users/tellw/hexo/source/docs/2021-python-libraries.html', encoding='utf8') as f:
        text = f.read()
    soup = BeautifulSoup(text, 'html.parser')
    h1 = soup.find('h1')

    print('<div class="post-toc"><ol class="toc">')
    # get_text()[:-1] drops the last character of each heading's text.
    print(f'<li class="toc-item toc-level-1"><a class="toc-link" href="#{h1["id"]}"><span class="toc-text">{h1.get_text()[:-1]}</span></a>')
    child = False  # True while an <ol class="toc-child"> is still open
    for h1n in h1.next_siblings:
        if h1n.name == 'h1':
            # A new top-level entry: close any open child list first.
            if child:
                print('</ol>')
                child = False
            print(f'</li><li class="toc-item toc-level-1"><a class="toc-link" href="#{h1n["id"]}"><span class="toc-text">{h1n.get_text()[:-1]}</span></a>')
        elif h1n.name == 'h2':
            # The first h2 under an h1 opens the nested list.
            if not child:
                print('<ol class="toc-child">')
                child = True
            print(f'<li class="toc-item toc-level-2"><a class="toc-link" href="#{h1n["id"]}"><span class="toc-text">{h1n.get_text()[:-1]}</span></a></li>')
    # Close the child list if the last h1 had h2 entries under it.
    if child:
        print('</ol>')
    print('</li></ol></div>')

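The key point is that iterating over `next_siblings` of the first h1 yields the h1 and h2 headings that follow it in document order, and the nested (tree-shaped) TOC markup is built from that sequence.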
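
The tree-building step can also be looked at separately from the parsing. The following is a minimal sketch, not the script used for the post: it assumes the headings have already been collected into a flat list of `(level, anchor_id, text)` tuples. The function name `build_toc`, the flags `item_open` and `child_open`, and the tuple layout are all hypothetical names chosen for illustration, and the sample headings are made up.

```python
from html import escape


def build_toc(headings):
    """Build nested TOC markup from a flat list of (level, anchor_id, text).

    Hypothetical helper: level is 1 for an h1 and 2 for an h2, and the
    list is assumed to be in document order.
    """
    parts = ['<div class="post-toc"><ol class="toc">']
    item_open = False   # a level-1 <li> has been opened and not yet closed
    child_open = False  # an <ol class="toc-child"> has been opened and not yet closed
    for level, anchor_id, text in headings:
        link = (f'<a class="toc-link" href="#{escape(anchor_id)}">'
                f'<span class="toc-text">{escape(text)}</span></a>')
        if level == 1:
            # Close whatever the previous level-1 entry left open.
            if child_open:
                parts.append('</ol>')
                child_open = False
            if item_open:
                parts.append('</li>')
            parts.append(f'<li class="toc-item toc-level-1">{link}')
            item_open = True
        elif level == 2 and item_open:
            # Level-2 entries before the first level-1 entry are skipped.
            if not child_open:
                parts.append('<ol class="toc-child">')
                child_open = True
            parts.append(f'<li class="toc-item toc-level-2">{link}</li>')
    # Close anything still open after the last heading.
    if child_open:
        parts.append('</ol>')
    if item_open:
        parts.append('</li>')
    parts.append('</ol></div>')
    return ''.join(parts)


# Made-up sample data: two sections, the first with one subsection.
toc = build_toc([(1, 'web', 'Web'), (2, 'flask', 'Flask'), (1, 'data', 'Data')])
print(toc)
```

With Beautiful Soup, such a list could be gathered from `h1.next_siblings` by keeping only the nodes whose `.name` is `h1` or `h2`. Returning a string instead of printing makes it easy to check that every opened `<ol>` and `<li>` is closed.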
References: [Beautiful Soup 4.4.0 documentation](https://beautifulsoup.cn/)
Created 2023.2.9/12.18, modified 2023.2.9/12.19