article-extractor

Extract full article text and metadata from web pages

What Is It

A Claude Code skill for extracting full article text and metadata from web pages. It strips away navigation, ads, sidebars, and other non-content elements to deliver clean, readable article text. Ideal for content research, archiving, and building knowledge bases from web sources.

How To Use

When you provide a URL, the skill automatically fetches the web page, identifies the main article content, and extracts clean text along with metadata such as title, author, publish date, and description. It handles various website layouts and content management systems.

The extracted content can be used for research, summarization, or further processing within your Claude workflow.
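The fetch, identify, and extract flow described above can be sketched in a few lines of standard-library Python. This is a rough illustration only, not the skill's actual implementation: the names `fetch_html`, `ArticleParser`, `extract_article`, `SKIP_TAGS`, and `META_KEYS` are hypothetical, and the heuristic (keep `<p>` text that sits outside navigation-like tags) is an assumption that is far simpler than what a real extractor needs.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Assumption for this sketch: text inside these tags is never article content.
SKIP_TAGS = {"script", "style", "nav", "aside", "header", "footer", "form"}

# Assumption: map a few common <meta> names to the metadata fields we report.
META_KEYS = {
    "author": "author",
    "description": "description",
    "article:published_time": "date",
}


class ArticleParser(HTMLParser):
    """Collects the <title>, selected <meta> values, and paragraph text."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self.paragraphs = []
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._in_title = False
        self._in_paragraph = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "meta":
            name = attrs.get("name") or attrs.get("property")
            if name in META_KEYS and attrs.get("content"):
                self.metadata[META_KEYS[name]] = attrs["content"]
        elif tag == "title":
            self._in_title = True
        elif tag == "p" and self._skip_depth == 0:
            self._in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_paragraph:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata.setdefault("title", data.strip())
        elif self._in_paragraph and self._skip_depth == 0:
            self._buffer.append(data)


def fetch_html(url):
    """Download a page and decode it, falling back to UTF-8."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_article(html):
    """Return metadata and clean paragraph text for one HTML document."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {"metadata": parser.metadata, "text": "\n\n".join(parser.paragraphs)}
```

Calling `extract_article(fetch_html(url))` would return a dictionary with `metadata` and `text` keys. Real pages vary widely, so a production extractor would likely need content scoring or a dedicated readability library instead of this fixed tag list.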

Key Features

  • Clean text extraction from web articles, removing ads, navigation, and clutter
  • Metadata extraction including title, author, date, and description
  • Handles various website layouts and CMS platforms
  • Integrates with other Tapestry skills for content processing pipelines
  • Preserves article structure and formatting

GitHub Stats

License: MIT
Version: 1.0.0

Categories

Communication

Tags

web
document-processing
coding
writing
video
