What are the differences between publishOn and subscribeOn in Project Reactor?
Table of Contents
- Introduction
- 1.
**subscribeOn**
- Controlling the Initial Subscription Thread - 2.
**publishOn**
- Controlling the Thread for Downstream Operations - Key Differences Between
subscribeOn
andpublishOn
- Practical Considerations
- Conclusion
Introduction
In Project Reactor, thread management plays a crucial role in building efficient reactive applications. Since reactive programming is asynchronous and non-blocking, controlling where and how different parts of your reactive pipeline are executed is key to optimizing performance and resource utilization.
Two operators that help with thread management in Project Reactor are **subscribeOn**
and **publishOn**
. While both operators allow you to switch the thread context, they serve different purposes and behave in distinct ways within the reactive pipeline.
In this article, we’ll explore the key differences between **publishOn**
and **subscribeOn**
, and how they are used to manage threads in reactive programming.
1. **subscribeOn**
- Controlling the Initial Subscription Thread
The **subscribeOn**
operator is used to specify the thread on which the initial subscription of a reactive stream will occur. It affects the first operation in the stream, and all operations upstream (i.e., before it) will also run on the specified thread.
**subscribeOn**
sets the execution context for the entire stream, but only from the point of subscription onwards. This means that it governs the thread on which the entire chain of operations will run, but it only impacts the start of the stream.
Example of subscribeOn
:
In this example:
- The
**subscribeOn(Schedulers.boundedElastic())**
operator makes sure the entire pipeline starts on a bounded elastic thread pool, typically used for I/O-bound operations. - Even though the
**doOnNext**
is defined after**subscribeOn**
, it runs on the same thread, because**subscribeOn**
sets the execution thread for the entire pipeline.
Key Points of subscribeOn
:
- Affects the entire stream starting from the subscription point.
- Changes the thread used for subscription and upstream operations (operations before the subscription).
- Useful for controlling the thread pool for blocking operations at the start of the pipeline.
2. **publishOn**
- Controlling the Thread for Downstream Operations
The **publishOn**
operator, on the other hand, controls the thread context for downstream operations. It does not affect the thread used for the initial subscription but changes the thread context for all subsequent operations in the stream (i.e., all operations after it).
**publishOn**
is used to change the thread for downstream operations, enabling you to control how and where specific operations in the pipeline will run, without affecting upstream operations.
Example of publishOn
:
In this example:
**doOnNext**
before**publishOn**
runs on the calling thread (usually the main thread).**publishOn(Schedulers.parallel())**
switches the execution to a parallel thread pool for subsequent operations in the stream.- The second
**doOnNext**
runs on a different thread (from the parallel thread pool), even though both operations are part of the same stream.
Key Points of publishOn
:
- Affects downstream operations (i.e., operations after the point where it is used).
- Does not affect the initial subscription, but it controls the thread for everything that happens afterward.
- Useful for switching threads during the middle of a stream to offload work to a different pool (e.g., for CPU-bound tasks or parallelism).
Key Differences Between subscribeOn
and publishOn
Feature | **subscribeOn** | **publishOn** |
---|---|---|
Effect Scope | Affects the entire stream, starting from the point of subscription. | Affects only the downstream operations after the point where it is used. |
Usage | Used to specify the thread on which the subscription and all upstream operations run. | Used to change the thread for subsequent operations in the stream. |
Thread Impact | Affects subscription thread and all upstream operations. | Affects downstream operations after its placement. |
Use Case | When you want to change the thread where the stream is subscribed and initiated (e.g., I/O operations at the start). | When you want to change the thread for specific operations that follow it, such as parallel processing or offloading work. |
Typical Scenarios | When you need to perform blocking operations (e.g., reading from a file, network calls) or switch thread pools at the start. | When you need to offload CPU-bound work or ensure operations after a certain point run on a dedicated thread pool. |
Practical Considerations
1. Combining **subscribeOn**
and **publishOn**
Both **subscribeOn**
and **publishOn**
can be used together to achieve fine-grained control over thread execution in different parts of the stream.
Example:
In this case:
**subscribeOn(Schedulers.boundedElastic())**
starts the stream on a bounded elastic thread pool, which is ideal for I/O-bound tasks.**publishOn(Schedulers.parallel())**
switches the execution to a parallel thread pool for the downstream operations (i.e., those that happen after**publishOn**
).
2. Impact on Performance
**subscribeOn**
is typically used at the start of the pipeline to ensure that the entire reactive stream, including all upstream operations, runs on a specific thread pool.**publishOn**
allows for thread switching at any point in the pipeline. It’s best used when you need to isolate different parts of the stream to execute on separate thread pools (e.g., to separate I/O-bound and CPU-bound tasks).
3. Thread Safety
Thread safety considerations are important when switching threads:
- Be cautious about state sharing across threads, as different parts of the stream may run on different threads. Avoid shared mutable state unless it is properly synchronized.
Conclusion
Both **subscribeOn**
and **publishOn**
are powerful tools in Project Reactor for controlling thread execution within reactive streams.
- Use
**subscribeOn**
when you need to control which thread the entire stream starts on, typically for I/O-bound or blocking operations. - Use
**publishOn**
when you want to change the thread context for specific downstream operations after a certain point in the stream.
Understanding when and how to use these operators will help you manage threading in your reactive applications, ensuring better performance, scalability, and resource utilization.