# Internet Federation Database Standard (IFDS) Version 1
IFDS (Internet Federation Database Standard) is a standard database system that allows internet clients, both
authorized and public users, to query the database to check whether given content or the entities involved in a
communication could be classified as spam or malicious. This allows a client to put protections in place against
spam and malicious content on various communication channels such as email, social media, messaging apps, and more.
# Introduction
# Table of Contents
<!-- TOC -->
* [Internet Federation Database Standard (IFDS) Version 1](#internet-federation-database-standard-ifds-version-1)
* [Introduction](#introduction)
* [Table of Contents](#table-of-contents)
* [Definitions](#definitions)
* [Clients](#clients)
* [Client TOTP Signature](#client-totp-signature)
* [Client Object](#client-object)
* [Federation Standard](#federation-standard)
* [Standard Federated Addresses](#standard-federated-addresses)
* [Query Document](#query-document)
* [QueryDocument Versioning](#querydocument-versioning)
* [QueryDocument Subject Types](#querydocument-subject-types)
* [RECON](#recon)
* [ANALYZE](#analyze)
* [REPORT](#report)
* [QueryDocument Event Types](#querydocument-event-types)
* [QueryDocument Object](#querydocument-object)
* [QueryDocument.Peer Object](#querydocumentpeer-object)
* [QueryDocument.Content Object](#querydocumentcontent-object)
* [QueryDocument.Attachment Object](#querydocumentattachment-object)
* [QueryDocument.Report Object](#querydocumentreport-object)
* [Platforms](#platforms)
* [Telegram](#telegram)
<!-- TOC -->
# Definitions
The following definitions are used in this document. They are not official and are only provided to give a better
understanding of the document and the purposes behind features and fields.
- **Peer** - Everything from users, bots, channels, groups, and more is considered a peer. A peer is represented by
a standard federated address, see [Standard Federated Addresses](#standard-federated-addresses) for more information.
A peer can be a parent or child of another peer. For example, a user could be a child of a group or channel with
an association type of "member". A peer could also be a parent of another peer, for example, a group or channel
could be a parent of a user with an association type of "owner" or "admin". These associations are only made known
by trusted clients if specified in the QueryDocument, see [QueryDocument](#query-document) for more information.
- **Discovery** - The process of using the information provided in a QueryDocument to discover and collect information
about the content or entities involved in the communication, allowing the database to keep a record of, or gain
a better understanding of, peer associations, content, and more. Discovery is not required for a QueryDocument
to be valid. However, it is recommended to provide as much information as possible so that the database can
better understand the content or entities involved in the communication.
# Clients
TODO: Write this section
## Client TOTP Signature
TODO: Write this section
## Client Object
TODO: Write this section
# Federation Standard
The federation standard defines how clients should communicate with servers.
## Standard Federated Addresses
A standard federated address is a universally unique identifier which can represent multiple types of peers from
different platforms. Federated address formats are standardized so that clients only need to use one format, and
so that servers can reject requests for platforms they do not support.
![Federated Telegram User](assets/federated_telegram_user.png)
![Federated Email Address](assets/federated_email_address.png)
![Federated IPV4](assets/federated_ipv4.png)
The format of a standard federated address is as follows:
`source`.`type`:`id`
- `source` - The platform/source where the peer is from. Must be a supported platform, see [Platforms](#platforms)
- `type` - The type of peer. Peer types are unique to the platform, see the platform's documentation for more information
- `id` - The unique identifier of the peer. The identifier must be unique to the platform, see the platform's documentation for more information
Even though the value types are defined by the platform's specifications for the federated address, the end result
will always be a parsable address that can be used by the server to identify the peer.
When coming up with a federated address standard for a platform, it is important to keep in mind that `id` must be
unique to the peer on the platform. For example, if a platform has a username system, it is best to avoid using
something that can be changed by the user, such as a username. Instead, use something that permanently identifies the
peer, such as an ID or UUID provided by the platform. This ensures that the federated address will always be valid and
will always point to the correct peer, even if the peer changes their username or other mutable information.
The following regex pattern matches a standard federated address:
```regex
(?<source>[a-z0-9]+)\.(?<type>[a-z0-9]+):(?<id>.+)
```
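
As an illustration, the pattern above could be applied in Python roughly as follows. This is a minimal sketch, not part of the standard: the names `FederatedAddress` and `parse_federated_address` are hypothetical, Python spells named groups as `(?P<name>...)`, and matching the whole string with `fullmatch` is an assumption since the pattern itself is not anchored.

```python
import re
from typing import NamedTuple

# Same pattern as above, written with Python's (?P<name>...) named-group syntax.
FEDERATED_ADDRESS = re.compile(
    r"(?P<source>[a-z0-9]+)\.(?P<type>[a-z0-9]+):(?P<id>.+)"
)


class FederatedAddress(NamedTuple):
    """Hypothetical container for the three parts of an address."""
    source: str
    type: str
    id: str


def parse_federated_address(address: str) -> FederatedAddress:
    """Split a standard federated address into source, type and id."""
    # Assumption: the entire string must be an address, hence fullmatch().
    match = FEDERATED_ADDRESS.fullmatch(address)
    if match is None:
        raise ValueError(f"not a standard federated address: {address!r}")
    return FederatedAddress(match["source"], match["type"], match["id"])


if __name__ == "__main__":
    print(parse_federated_address("telegram.chat:-1001301191379"))
```

Because `id` is matched with `.+`, everything after the first `:` that follows the type is treated as the identifier, so identifiers containing further `:` or `.` characters are kept intact.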
# Query Document
A QueryDocument is a document constructed by the client which presents the contents of a message or event so that it
can be put into discovery, analyzed for spam or malicious content, or reported, where the report has been
produced either by a client's automated system or manually by a user.
## QueryDocument Versioning
The version of the QueryDocument is represented by the `version` field. The version must always be "1" for now, but
other versions of the QueryDocument structure may be introduced in the future. To ensure backwards compatibility, the
version field must be checked to ensure the server can parse the document correctly. If the server does not support
the version of the document, the server must reject the document and return an error to the client.
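
A server-side version check could look roughly like the sketch below. The names `SUPPORTED_VERSIONS`, `UnsupportedVersionError` and `check_version` are hypothetical, and how the error is reported back to the client is not defined by this document.

```python
# Versions this hypothetical server can parse; "1" is the only one defined so far.
SUPPORTED_VERSIONS = frozenset({"1"})


class UnsupportedVersionError(ValueError):
    """Hypothetical error a server could translate into its rejection response."""


def check_version(document: dict) -> str:
    """Return the document's version, or raise if the server cannot parse it."""
    version = document.get("version")
    # The field is a string, so the integer 1 or a missing field is rejected too.
    if not isinstance(version, str) or version not in SUPPORTED_VERSIONS:
        raise UnsupportedVersionError(
            f"unsupported QueryDocument version: {version!r}"
        )
    return version
```

Checking the version before touching any other field keeps a server from misreading a document whose structure it does not know.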
## QueryDocument Subject Types
The subject type of the QueryDocument is represented by the `subject_type` field. The subject type must be one of the
following values:
TODO: Write this section
### RECON
![RECON Example](assets/recon_example.png)
TODO: Write this section
### ANALYZE
TODO: Write this section
### REPORT
TODO: Write this section
## QueryDocument Event Types
The event type of the QueryDocument is represented by the `event_type` field. The event type must be one of the
following values:
- **GENERAL** - A general event that does not fit into any other event type. The `GENERAL` event type is
used for events such as a general update to a peer's activity or status that could be used to keep the database
up to date with the peer's activity or status.
- **INCOMING** - An incoming message/request from a peer that was captured by the client.
- **OUTGOING** - An outgoing message/request to a peer that was captured by the client.
- **PEER_JOIN** - A peer joins or connects to the `channel_peer`.
- **PEER_LEAVE** - A peer leaves or disconnects from the `channel_peer`.
- **PEER_BAN** - A peer is banned from the `channel_peer`. In such cases the `from_peer` is the peer that banned
the `to_peer` and the `to_peer` is the peer that was banned.
- **PEER_UNBAN** - A peer is unbanned from the `channel_peer`. In such cases the `from_peer` is the peer that
unbanned the `to_peer` and the `to_peer` is the peer that was unbanned.
- **PEER_KICK** - A peer is kicked from the `channel_peer`. In such cases the `from_peer` is the peer that kicked
the `to_peer` and the `to_peer` is the peer that was kicked.
- **PEER_RESTRICT** - A peer is restricted in the `channel_peer`. In such cases the `from_peer` is the peer that
restricted the `to_peer` and the `to_peer` is the peer that was restricted.
- **ANNOUNCEMENT** - An announcement from the `channel_peer` that was captured by the client. Optionally, the
`from_peer` field can be used to specify the peer that made the announcement. If the `from_peer` field is not
specified, the announcement is assumed to be from the `channel_peer`.
Events are used to give the database context about what's happening in the channel. For example, a server with
anomaly detection may use the events to determine whether a peer is sending too many messages in a short period of
time or whether a channel is being raided by a group of peers.
In all other cases `GENERAL` may be used, for example if the client is simply doing RECON requests to keep the
database up to date with the peer's activity or status.
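
The event types above can be modelled as an enumeration. The following is a minimal sketch under stated assumptions: `EventType`, `parse_event_type` and `announcement_source` are hypothetical names, and the document is assumed to be an already decoded JSON object. The last function illustrates the `ANNOUNCEMENT` fallback rule, where a missing `from_peer` means the announcement is attributed to the `channel_peer`.

```python
from enum import Enum
from typing import Optional


class EventType(str, Enum):
    """The event_type values listed in this section."""
    GENERAL = "GENERAL"
    INCOMING = "INCOMING"
    OUTGOING = "OUTGOING"
    PEER_JOIN = "PEER_JOIN"
    PEER_LEAVE = "PEER_LEAVE"
    PEER_BAN = "PEER_BAN"
    PEER_UNBAN = "PEER_UNBAN"
    PEER_KICK = "PEER_KICK"
    PEER_RESTRICT = "PEER_RESTRICT"
    ANNOUNCEMENT = "ANNOUNCEMENT"


def parse_event_type(value: str) -> EventType:
    """Convert the raw event_type field into an EventType, rejecting unknown values."""
    try:
        return EventType(value)
    except ValueError:
        raise ValueError(f"unknown event_type: {value!r}") from None


def announcement_source(document: dict) -> Optional[str]:
    """Return the peer an ANNOUNCEMENT is attributed to.

    Uses from_peer when present, otherwise falls back to channel_peer.
    """
    if parse_event_type(document["event_type"]) is not EventType.ANNOUNCEMENT:
        raise ValueError("not an ANNOUNCEMENT document")
    return document.get("from_peer") or document.get("channel_peer")
```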
## QueryDocument Object
The QueryDocument is a JSON object which contains the following fields:
| Name | Type | Required | Example Value(s) | Description |
|-------------------------|--------------------------------------------|----------|-----------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `version` | `string` | Yes | `1` | The version of the QueryDocument, must always be "1" |
| `subject_type` | `string` | Yes | `RECON`, `ANALYZE` or `REPORT` | The subject type, must be "**RECON**", "**ANALYZE**" or "**REPORT**" |
| `client_id` | `string` | Yes | `00000000-0000-0000-0000-000000000000` | The client's unique UUIDv4 identifier to identify who the document is from |
| `client_totp_signature` | `string` | No | `a94a8fe5ccb19ba61c4c0873d391e987982fbbd3` | The client's TOTP signature, used to verify the client's identity together with the document's reported timestamp. If the client does not use authentication, this field is not required, see [Client TOTP Signature](#client-totp-signature) |
| `timestamp` | `int` | No | `1614556800` | The timestamp of the document. If the client does not use authentication, this field is not required; the server will use its own timestamp instead. |
| `platform` | `string` | Yes | `telegram.org`, `email`, `discord.com`, etc. | The platform of the message or event, must be a supported service, see [Platforms](#platforms) |
| `event_type` | `string` | Yes | `INCOMING`, `PEER_JOIN`, `PEER_LEAVE`, etc. | The event type of the message or event that is represented by the document, this provides the server with context to detect anomalies in the document, see [QueryDocument Event Types](#querydocument-event-types) |
| `channel_peer` | `string` | No | `telegram.chat:-1001301191379` | The channel in which the content was sent, represented as a Standard Federated Address. This could be a communication channel or chat room that one or more peers use to broadcast messages, see [Standard Federated Addresses](#standard-federated-addresses) |
| `resent_from_peer` | `string` | No | `telegram.user:123456789` | The peer from which the content was resent/forwarded, represented as a Standard Federated Address, see [Standard Federated Addresses](#standard-federated-addresses) |
| `from_peer` | `string` | No | `telegram.user:123456789` | The peer that sent the content, represented as a Standard Federated Address, see [Standard Federated Addresses](#standard-federated-addresses) |
| `to_peer` | `string` | No | `telegram.user:123456789` | The peer the content was sent to, that is, the intended recipient of the content, represented as a Standard Federated Address, see [Standard Federated Addresses](#standard-federated-addresses) |
| `proxy_peer` | `string` | No | `telegram.user:123456789` | The proxy peer through which the content was sent, represented as a Standard Federated Address. If a proxy is identified, that could mean `from_peer` is not the original sender of the content but instead the proxy peer is, see [Standard Federated Addresses](#standard-federated-addresses) |
| `peers` | [`Peer[]`](#querydocumentpeer-object) | No | N/A | The peer definitions of the document, see [Peer](#querydocumentpeer-object) |
| `content` | [`Content`](#querydocumentcontent-object) | No | N/A | The content of the document if applicable to the event type, see [Content](#querydocumentcontent-object) |
| `attachments` | [`Attachment[]`](#querydocumentattachment-object) | No | N/A | Optional attachments if the content contains any, see [Attachment](#querydocumentattachment-object) |
| `reports` | [`Report[]`](#querydocumentreport-object) | No | N/A | Optional reports if the content contains any, see [Report](#querydocumentreport-object) |
> **Note**: The `timestamp` field is required if the client uses authentication, because the `client_totp_signature` is verified against the document's reported timestamp. If the client does not use authentication, the `timestamp` field is not required and the server will use its own timestamp instead.
> **Note**: `peers` uses the Standard Federated Address as the key
> **Note**: `attachments` uses the File Name as the key
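
To make the table above more concrete, the sketch below builds a minimal unauthenticated QueryDocument in Python and checks it for the fields marked as required. The helper `missing_required_fields` is hypothetical, the values are the example values from the table, and the `peers`, `content`, `attachments` and `reports` fields are left out because their objects are not specified yet.

```python
import json

# Fields marked "Yes" in the Required column of the table above.
REQUIRED_FIELDS = ("version", "subject_type", "client_id", "platform", "event_type")


def missing_required_fields(document: dict) -> list:
    """Return the names of required fields absent from the document, in table order."""
    return [name for name in REQUIRED_FIELDS if name not in document]


# An incoming message seen in a channel, submitted for discovery only.
document = {
    "version": "1",
    "subject_type": "RECON",
    "client_id": "00000000-0000-0000-0000-000000000000",  # placeholder from the table
    "platform": "telegram.org",
    "event_type": "INCOMING",
    "channel_peer": "telegram.chat:-1001301191379",
    "from_peer": "telegram.user:123456789",
}

if __name__ == "__main__":
    assert missing_required_fields(document) == []
    print(json.dumps(document, indent=2))
```

Since this client does not use authentication, `client_totp_signature` and `timestamp` are omitted and the server would fall back to its own timestamp.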
### QueryDocument.Peer Object
TODO: Write this section
### QueryDocument.Content Object
TODO: Write this section
### QueryDocument.Attachment Object
TODO: Write this section
### QueryDocument.Report Object
TODO: Write this section
# Platforms
TODO: Write this section
## Telegram
TODO: Write this section