Stream Data Generator

Built on top of the Data Generator, for scheduling continuous data streams from different schemas and sending it to different destinations such as message channels (Kafka, RabbitMQ), REST endpoints or JDBC databases.

anomaly detection logo

The stream-data-generator is still WIP, though being used to generate the test data for all streaming-runtime sample use cases.

Here is sample configuration to configure two parallel treads to generate and send data to two Kafka topics:

Quick Start

user.yaml

stream:
 data:
   generator:
     terminateAfter: 300s

     destinations:
       - name: kafka        
         bootstrap: 'localhost:9092'
         schemaRegistryServer: 'http://localhost:8081'

     streams:
       - streamName: kafka-stream-songs
         destination: kafka
         valueFormat: AVRO_SCHEMA_REGISTRY
         avroSchema: |-
           {
            "namespace": "com.tanzu.streaming.runtime.playsongs.avro",
            "type": "record",
            "name": "Song",
            "doc": "unique_on=song_id;to_share=song_id",
            "fields": [
                {"name": "song_id", "type": "long",   
                 "doc" : "#{number.number_between '1','1000'}"},
                {"name": "album",   "type": "string", 
                 "doc" : "#{ancient.hero} #{ancient.god}"},
                {"name": "artist",  "type": "string", 
                 "doc" : "#{artist.names}"},
                {"name": "name",    "type": "string", 
                 "doc" : "#{rock_band.name}"},
                {"name": "genre",   "type": "string", 
                 "doc" : "#{music.genres}"}
            ]
           }
         batch:
           size: 100
           initialDelay: 1ms
           messageDelay: 10ms


       - streamName: kafka-stream-playevents
         destination: kafka
         avroSchema: |-
           {
            "namespace": "com.tanzu.streaming.runtime.playsongs.avro",
            "type": "record",
            "name": "PlayEvent",
            "fields": [
              {"name": "song_id",  "type": "long", 
               "doc":"[[#shared.field('song.song_id')]]" },
              {"name": "duration", "type": "long", 
               "doc":"#{number.number_between '30000','1000000'}" }
            ]
           }
         valueFormat: JSON
         batch:
           size: 1
           initialDelay: 10ms
           delay: 100ms
           messageDelay: 100ms