Mastodon

Mastodon is a federated social media and social networking service.

This loader fetches the text of the "toots" from a list of Mastodon accounts, using the Mastodon.py Python package.

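Public accounts can be queried by default without any authentication. To query non-public accounts or instances, you have to register an application for your account, which gives you an access token, and then set that token along with your account's API base URL.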
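As a rough sketch of the authenticated setup, the helper below collects the credentials from environment variables into keyword arguments. MASTODON_ACCESS_TOKEN is the variable named in this page's example; the helper name mastodon_auth_kwargs and the MASTODON_API_BASE_URL variable are assumptions made only for this illustration and are not part of LangChain.

```python
import os


def mastodon_auth_kwargs(default_base_url=None):
    """Build keyword arguments for authenticated access (hypothetical helper).

    MASTODON_ACCESS_TOKEN is the variable mentioned on this page.
    MASTODON_API_BASE_URL is an assumed name used only in this sketch.
    """
    kwargs = {}

    token = os.environ.get("MASTODON_ACCESS_TOKEN")
    if token:
        kwargs["access_token"] = token

    base_url = os.environ.get("MASTODON_API_BASE_URL", default_base_url)
    if base_url:
        kwargs["api_base_url"] = base_url

    return kwargs


# Possible usage, assuming the loader accepts these keyword arguments
# as shown in the commented example on this page:
# loader = MastodonTootsLoader(
#     mastodon_accounts=["@Gargron@mastodon.social"],
#     **mastodon_auth_kwargs(),
# )
```

When neither variable is set, the helper returns an empty dict, so the same call also covers the unauthenticated public-account case.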

Then pass in the Mastodon account names you want to extract, in the @account@instance format.
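To make the handle format concrete, here is a minimal sketch that checks a string against the @account@instance shape before it is handed to the loader. The function split_handle is a hypothetical helper for illustration; the loader does not require it and may parse handles differently.

```python
from typing import Tuple


def split_handle(handle: str) -> Tuple[str, str]:
    """Split '@account@instance' into (account, instance).

    Hypothetical helper for illustration only; not part of LangChain.
    """
    if not handle.startswith("@"):
        raise ValueError(f"handle must start with '@': {handle!r}")

    # Drop the leading '@', then split on the first remaining '@'.
    account, sep, instance = handle[1:].partition("@")
    if not sep or not account or not instance:
        raise ValueError(f"expected '@account@instance', got {handle!r}")

    return account, instance


print(split_handle("@Gargron@mastodon.social"))
# ('Gargron', 'mastodon.social')
```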

from langchain_community.document_loaders import MastodonTootsLoader
API Reference: MastodonTootsLoader
%pip install --upgrade --quiet  Mastodon.py
loader = MastodonTootsLoader(
    mastodon_accounts=["@Gargron@mastodon.social"],
    number_toots=50,  # Default value is 100
)

# Or set up access information to use a Mastodon app.
# Note that the access token can either be passed into the
# constructor or set via the environment variable "MASTODON_ACCESS_TOKEN".
# loader = MastodonTootsLoader(
#     access_token="<ACCESS TOKEN OF MASTODON APP>",
#     api_base_url="<API BASE URL OF MASTODON APP INSTANCE>",
#     mastodon_accounts=["@Gargron@mastodon.social"],
#     number_toots=50,  # Default value is 100
# )
documents = loader.load()
for doc in documents[:3]:
    print(doc.page_content)
    print("=" * 80)
<p>It is tough to leave this behind and go back to reality. And some people live here! I’m sure there are downsides but it sounds pretty good to me right now.</p>
================================================================================
<p>I wish we could stay here a little longer, but it is time to go home 🥲</p>
================================================================================
<p>Last day of the honeymoon. And it’s <a href="https://mastodon.social/tags/caturday" class="mention hashtag" rel="tag">#<span>caturday</span></a>! This cute tabby came to the restaurant to beg for food and got some chicken.</p>
================================================================================

The toot texts (the documents' page_content) are by default HTML, as returned by the Mastodon API.
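If plain text is more convenient downstream, one option is to strip the markup yourself. The sketch below uses only Python's standard html.parser module; the names _TextExtractor and toot_html_to_text are hypothetical helpers for this illustration, not part of the loader, and the line-break handling is a simplifying assumption.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and drop tags (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Treat <br> and each new <p> as a line break so paragraphs stay apart.
        if tag in ("br", "p") and self._parts:
            self._parts.append("\n")

    def handle_data(self, data):
        self._parts.append(data)

    def text(self):
        return "".join(self._parts).strip()


def toot_html_to_text(html):
    """Return the visible text of a toot's HTML content."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    '<p>And it’s <a href="https://mastodon.social/tags/caturday" '
    'class="mention hashtag" rel="tag">#<span>caturday</span></a>!</p>'
)
print(toot_html_to_text(sample))
# And it’s #caturday!
```

Applied to loaded documents, this could look like toot_html_to_text(doc.page_content) for each doc in documents.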