2023-11-07 14:51:25 +01:00
# ProSol.Html.TagsProvider

TagsProvider is a tool for extracting HTML tags from a string in an event-driven way.
It helps to extract text and structured data from a specific site.

## How to use?

Install the package:

```sh
dotnet add package ProSol.Html.TagsProvider
```

Create an observer:

```csharp
internal class ConsoleLogObserver : IObserver<TagsProviderMessage>
{
    public void OnCompleted() { }

    public void OnError(Exception error) { }

    // Prints the name of each tag the provider reports.
    public void OnNext(TagsProviderMessage value)
    {
        Console.WriteLine(value.CurrentTag.TagInfo.Name);
    }
}
```

Run the TagsProvider:

```csharp
var provider = new TagsProvider();
using var unsub = provider.Subscribe(new ConsoleLogObserver());
provider.Process("<div> <span> </span> </div>");
```

Get the output:

```
span
div
```

That's it!

The provider sends a notification for every tag it encounters, including the tag's:

- name,
- range of the entire tag,
- range of the inner content.

More demos are available [here](https://git.disroot.org/alexenko/Demos/src/branch/master/ProSol.TagsProvider).
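
Because every notification carries the tag name, an observer can also accumulate results instead of printing them. The following is a minimal sketch rather than part of the library: `TagNameCollector` is a hypothetical class, and it relies only on the members already shown above (`TagsProvider`, `Subscribe`, `Process`, and `value.CurrentTag.TagInfo.Name`). It assumes notifications are delivered before `Process` returns, and it omits the `using` directive for the library's namespace, as the snippets above do.

```csharp
using System;
using System.Collections.Generic;

var collector = new TagNameCollector();
var provider = new TagsProvider();

// Dispose the subscription once processing is done.
using (provider.Subscribe(collector))
{
    provider.Process("<div> <span> </span> </div>");
}

// Assumption: all notifications have arrived by the time Process returns.
Console.WriteLine(string.Join(", ", collector.Names));

// Hypothetical observer (not part of the library): stores tag names
// in the order the provider reports them.
internal class TagNameCollector : IObserver<TagsProviderMessage>
{
    public List<string> Names { get; } = new();

    public void OnCompleted() { }

    public void OnError(Exception error) { }

    public void OnNext(TagsProviderMessage value)
    {
        // Interpolation is used so the sketch does not depend on the exact type of Name.
        Names.Add($"{value.CurrentTag.TagInfo.Name}");
    }
}
```

Keeping the collected names in a list makes it easy to filter or count tags after processing, without changing how the provider is driven.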