Alexander Kozachenko b9beb96ca2 Updated nuspec 2023-12-08 09:55:59 +03:00

ProSol.Html.TagsProvider

TagsProvider is a tool for extracting HTML tags from a string in an event-driven way. It helps extract text and structured data from a specific site.

How to use?

Install the package:

dotnet add package ProSol.Html.TagsProvider --version 2.0.0-rc1.3

Fetch some HTML:
```csharp
var url = "https://en.wikipedia.org/wiki/Food_energy";
var html = await HtmlSource.GetHtmlAsync(url);
```

Process all `a` tags:
```csharp
var provider = new TagsProvider();
var data = new DataSubscriber<string>();

provider
    // Select only the tags named "a".
    .Endpoint(x => x.CurrentTag.TagInfo.Name == "a")
    // Map each selected tag to the slice of html inside it.
    .Translate(x => html[x.CurrentTag.InnerTextRange])
    .Subscribe(data);

provider.Process(html);

foreach (var item in data.Messages)
{
    Console.WriteLine(item);
}
```

The `HtmlSource` helper used above:
```csharp
internal static class HtmlSource
{
    internal static async Task<string> GetHtmlAsync(string url)
    {
        using var client = new HttpClient();
        using var response = await client.GetAsync(url);
        // Fail early on a non-success status instead of parsing an error page.
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

That's it! The provider reports every tag it encounters, along with its data:

  • name,
  • range of entire tag,
  • range of inner content.
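
Because the usage snippet indexes `html` with `InnerTextRange`, the ranges appear to be ordinary `System.Range` values that can slice the source string directly. The sketch below illustrates the difference between the two ranges without using the library at all: the sample markup and the variable names `entireTag` and `innerContent` are made up for illustration, and the positions are found with plain `IndexOf` calls rather than by TagsProvider.

```csharp
using System;

// Hypothetical sample markup, not taken from the library.
var html = "<p>Hello <a href=\"/x\">link</a></p>";

// Locate the <a> element by hand; a provider would report these positions instead.
int tagStart = html.IndexOf("<a", StringComparison.Ordinal);
int innerStart = html.IndexOf('>', tagStart) + 1;
int innerEnd = html.IndexOf("</a>", innerStart, StringComparison.Ordinal);
int tagEnd = innerEnd + "</a>".Length;

Range entireTag = tagStart..tagEnd;        // opening tag through closing tag
Range innerContent = innerStart..innerEnd; // only what sits between them

Console.WriteLine(html[entireTag]);    // <a href="/x">link</a>
Console.WriteLine(html[innerContent]); // link
```

Keeping ranges rather than copied substrings means nothing is allocated until a slice is actually needed.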

More demos here.

Happy coding!