Water scarcity is a critical challenge in the Souss-Massa region of Morocco, particularly in agricultural areas, where traditional monitoring by field visits is time-consuming and resource-intensive. Standard vegetation indices such as NDVI often fail to capture root-zone moisture stress, which is crucial for effective water management. This research introduces a novel Multi-Modal Attention U-Net architecture that integrates 16 spectral and environmental bands from multiple data sources for agricultural land segmentation and water stress mapping. The model fuses optical imagery from Sentinel-2, radar data from Sentinel-1, soil moisture measurements from SMAP, topographic data (a Digital Elevation Model (DEM) and slope), and precipitation data. Attention gates let the network focus on the features most informative for water stress detection. Training labels were generated by a weak supervision pipeline developed on Google Earth Engine. The model achieves 93.6% accuracy and a mean Intersection over Union (IoU) of 0.81 in segmenting agricultural areas and identifying water stress zones. Applied to the Taroudant region of Morocco, the methodology demonstrates practical utility for water resource management and agricultural planning, and contributes to remote sensing techniques for environmental monitoring in water-scarce regions.

Figure 1: Study area in Taroudant region, Souss-Massa, Morocco
Geographic location and extent of the study area, showing agricultural land in the Taroudant region of Morocco. The area has a semi-arid climate and intensive agricultural activity.

Figure 2: Multi-source remote sensing data integration
Visualization of the 16-band data fusion approach combining Sentinel-1 radar (VV, VH polarizations), Sentinel-2 optical bands, SMAP soil moisture, DEM, slope, and precipitation data.
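
As a rough illustration of how such a 16-band input could be assembled, the sketch below stacks rasters into one array and rescales each band to [0, 1]. It assumes the rasters have already been resampled to a common grid. The band split (10 Sentinel-2 bands alongside 2 Sentinel-1, 1 SMAP, and 3 terrain and precipitation layers) is an assumption chosen only so that the total reaches 16, since the text does not list the Sentinel-2 bands used. The function name and the per-band min-max scaling are likewise illustrative.

```python
import numpy as np

# Hypothetical band counts per source; only the total of 16 comes from the text.
BAND_COUNTS = {
    "sentinel2": 10,      # assumed number of optical bands
    "sentinel1": 2,       # VV and VH polarizations
    "smap": 1,            # soil moisture
    "dem": 1,
    "slope": 1,
    "precipitation": 1,
}

def stack_and_normalize(layers):
    """Stack (bands, H, W) arrays along the band axis and min-max scale each band."""
    stacked = np.concatenate(
        [layers[name] for name in BAND_COUNTS], axis=0
    ).astype(np.float32)
    lo = stacked.min(axis=(1, 2), keepdims=True)
    hi = stacked.max(axis=(1, 2), keepdims=True)
    # Guard against constant bands to avoid division by zero.
    return (stacked - lo) / np.maximum(hi - lo, 1e-6)

# Random stand-ins for co-registered rasters on a 64 x 64 grid.
rng = np.random.default_rng(0)
layers = {name: rng.normal(size=(n, 64, 64)) for name, n in BAND_COUNTS.items()}
fused = stack_and_normalize(layers)
print(fused.shape)  # (16, 64, 64)
```

Scaling each band separately keeps sources with very different units (reflectance, backscatter in dB, elevation in metres) on a comparable range before they enter the network.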

Figure 3: Attention U-Net architecture diagram
Architecture of the Multi-Modal Attention U-Net model, showing the encoder-decoder structure with skip connections and the attention gates that help the network focus on features relevant to water stress detection.
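
To make the role of an attention gate concrete, here is a minimal NumPy sketch of the additive gate commonly used in Attention U-Net designs. The text does not specify the exact gate used in this model, so the formulation, the omission of bias terms, and all names and shapes below are assumptions for illustration. The gating signal is assumed to have been resized to the resolution of the skip features.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(skip, gate, w_x, w_g, psi):
    """Additive attention gate on (C, H, W) feature maps (illustrative).

    skip: encoder skip features, shape (C_x, H, W)
    gate: decoder gating signal, assumed already resized to (C_g, H, W)
    w_x:  (C_int, C_x) weights acting as a 1x1 convolution on skip
    w_g:  (C_int, C_g) weights acting as a 1x1 convolution on gate
    psi:  (C_int,) weights collapsing to a single attention channel
    """
    theta = np.einsum("ic,chw->ihw", w_x, skip)
    phi = np.einsum("ic,chw->ihw", w_g, gate)
    hidden = np.maximum(theta + phi, 0.0)                    # ReLU
    alpha = sigmoid(np.einsum("i,ihw->hw", psi, hidden))     # per-pixel weight
    # Broadcast the attention map over the channels of the skip features.
    return skip * alpha[None, :, :], alpha

rng = np.random.default_rng(0)
skip = rng.normal(size=(8, 16, 16))
gate = rng.normal(size=(12, 16, 16))
w_x = 0.1 * rng.normal(size=(4, 8))
w_g = 0.1 * rng.normal(size=(4, 12))
psi = 0.1 * rng.normal(size=4)

gated, alpha = attention_gate(skip, gate, w_x, w_g, psi)
print(gated.shape, alpha.shape)  # (8, 16, 16) (16, 16)
```

Each pixel of the skip connection is multiplied by a coefficient between 0 and 1, so regions the decoder deems irrelevant are suppressed before the features are concatenated.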

Figure 4: Water stress mapping results
Classified water stress map showing different levels of agricultural water stress across the study area. Color-coded regions indicate varying degrees of water stress from low (green) to severe (red).

Figure 5: Agricultural land segmentation results
Comparison of ground truth and model predictions for agricultural land segmentation. The model achieves 93.6% accuracy and a mean IoU of 0.81.

Figure 6: Model performance metrics and validation results
Quantitative evaluation showing accuracy, precision, recall, F1-score, and IoU across the validation sets. The confusion matrix and loss curves illustrate model convergence and reliability.
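
For reference, the metrics named above can all be derived from a confusion matrix. The sketch below shows one common way to compute them for a label map. The function names and the toy two-class arrays are illustrative and are not the study's data or evaluation code.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def segmentation_metrics(cm):
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as class c, but wrong
    fn = cm.sum(axis=1) - tp          # truly class c, but missed
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    iou = tp / np.maximum(tp + fp + fn, 1)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "iou": iou,
        "mean_iou": iou.mean(),
    }

# Toy 2 x 3 label maps with two classes (0 = other, 1 = agriculture).
y_true = np.array([[0, 0, 1], [1, 1, 0]])
y_pred = np.array([[0, 1, 1], [1, 1, 0]])

cm = confusion_matrix(y_true, y_pred, num_classes=2)
metrics = segmentation_metrics(cm)
print(cm)                              # [[2 1] [0 3]]
print(round(metrics["accuracy"], 4))   # 0.8333
print(round(metrics["mean_iou"], 4))   # 0.7083
```

In the toy case the per-class IoU values are 2/3 and 3/4, which average to about 0.708. Overall accuracy is higher at 5/6, which shows why the two figures are reported side by side.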